Abstract
For multidimensional highorder nonlinear systems with unstable path quality in parameter and extension terms, we developed a new fast search random tree strategy. First, we established a highorder Lipschitz vector field dynamic system to adapt to highorder systems of multidegreeoffreedom robots, with the complex obstacle function being one of its key components. Secondly, we designed a classification gap filtering network layer (Classification LSTM) to screen training data models and ensure the global stability of data in path design. Additionally, the visual sensors deployed in the unit area effectively implement the path marking backtracking strategy and dead zone path simplification. Finally, three examples are provided to verify the effectiveness of this design method.
Similar content being viewed by others
Introduction
In recent years, interest in path planning design has increased due to the unstable path planning quality of smart devices in complex environments^{1,2,3}. Rapidlyexploring random tree (RRT)^{4,5,6} is an attractive method for planning design because it can efficiently search the planning space and generate approximately optimal paths, making it suitable for application scenarios with high realtime requirements, such as dynamic environments^{7,8}.
The first paper to use the RRT algorithm in path planning appears to be^{9}. Since then, there has been significant interest in improving RRT algorithms to optimize path planning quality. Lan and Jin^{10} proposed an improved RRT algorithm based on environmental ocean currents, which can adapt to the complex environment of real ocean currents and reduce the length of planned paths. An and Kim^{11} studied an adaptive stepsize RRT path planning algorithm using the maximum Cartesian displacement constraint. The advantage of this algorithm is that it uses a minimum obstacle size parameter and an operatorinduced matrix to determine the maximum allowable step size, offering practical benefits over fixed stepsize RRT planning algorithms. An interesting application^{12} is to use this constraint for parallel trajectory planning in dynamic environments.
In^{13}, a velocity planning method with minimum jitter was proposed to improve the smoothness and stability of mobile robots during path tracking. However, due to the presence of unknown parameters in the observed data, stability is limited. Zhou and Pan^{14} proposed a perceptionadaptive trajectory replanning method called Raptor, which dynamically adjusts the flight trajectory based on environmental perception to cope with sudden obstacles and environmental changes. Nilles and Ren^{15} designed a nondeterministic rebound strategy based on visibility calculation, analyzing the relationship between the robot and the visible area in the environment to design an effective path planning strategy in dynamic and unpredictable environments.
It should be noted that the path planning results of the abovementioned RRT algorithms and other extended technologies are focused on general linear^{16} or nonlinear systems. However, in multidegreeoffreedom robot systems, there are often highorder nonlinear terms^{17}. The parameters and extensions of such highorder nonlinear terms lead to unstable path planning quality. Therefore, it is imperative to study an improved RRT method based on multidimensional highorder nonlinear systems.
Based on these observations, we developed an improved RRT behavior planning method for robots using the PTM algorithm. The PTM algorithm is an optimization algorithm that includes a highorder Lipschitz vector field dynamic system, Classification LSTM, and a limited area sensor deployment design. The design model of the method in this study is shown in Fig. 1.
Figure 1 shows a comprehensive path planning and optimization system involving three main modules: the nonlinear system, the RRT (rapidlyexploring random tree) algorithm, and PTM (path tracking management). The nonlinear system is processed by Markov and complex obstacle functions and optimized using a gap filtering network layer. The RRT module performs path searches through random sampling and multidimensional systems and interacts with the PTM module. The PTM module includes sensor deployment, path marking, and dead zone simplification, and performs path optimization and management through backtracking and matrix generation. The overall system provides a highdimensional and multiorder path planning method through these interconnected submodules. The contributions of this article are threefold:

(1)
For general linear^{16} or nonlinear systems in static environments, in addition to existing results^{18}, we also consider a highorder Lipschitz vector field model. It establishes dynamic systems based on Markov nonlinear analysis. Compared with the fixed step size of^{11}, it has the planning flexibility within the safety limit curve. We also optimize the highorder Lipschitz vector field model based on the kinematics of the Differential Drive Robot (DDR), and simulate the planning process through threedimensional scene experiments to make it more suitable for real mobile robots.

(2)
A Classification LSTM is designed. Although our design is driven by^{12}, due to the existence of nonlinear parameter arbitrary terms and extension terms in multidimensional highorder path planning, the path planning quality is unstable. To overcome this difficulty, we propose a Classification LSTM.

(3)
A deployment method of visual sensors^{19} in a limited area is established. This method can efficiently implement two path optimization strategies: path marking backtracking strategy and dead zone path simplification. Compared with^{9,10,11,12,13,14,15}, it has better extended nodes and planning cycle.
The remainder of this article is organized as follows. “Highorder Lipschitz vector field dynamic system” Section establishes a highorder Lipschitz vector field dynamic system. “Classification LSTM” Section designs a dynamic system based on a Classification LSTM. “Deployment method of visual sensors in a limited area” Section discusses the establishment process of the method for deploying visual sensors in a limited area. In addition, the design principle of the improved RRT is given. “Experiment analysis” Section provides three examples to illustrate the theoretical results. “Conclusion” Section concludes this article and points out the research limitations and future work directions.
Highorder Lipschitz vector field dynamic system
The highorder Lipschitz vector field dynamic system is a mathematical model used to describe and control multidimensional highorder nonlinear systems. It uses the Lipschitz condition to ensure the stability of the system in a dynamic environment. Compared with the traditional Lipschitz vector field^{20}, it can more accurately handle the complex dynamic behavior of multidimensional highorder nonlinear systems.
Higherorder Lipschitz vector field theory based on Markov model
The Markov model^{21} is a statistical decision model. It is introduced into the highorder Lipschitz vector field theory to handle uncertainty and dynamic environments in path planning. This is because most actual planning scenarios are dynamically changing, including the movement of equipment, people, and other unpredictable factors. The Markov model can effectively predict and respond to changes in the environment by describing the transition probability of the system state, thereby improving the accuracy of path planning.
Assume that \(\left\{ {X\left( t \right), t \in T} \right\}\) is the random process of system operation. If for any positive integer n and \(t_{1} < \cdots < t_{n}\), \(P\left( {X\left( {t_{1} } \right) = x_{1} ,X\left( {t_{2} } \right) = x_{2} , \ldots ,X\left( {t_{n  1} } \right) = x_{n  1} } \right) > 0\), the conditional distribution satisfies:
Then we call \(\left\{ {X\left( t \right),t \in T} \right\}\) a Markov process.
In a nonlinear mechanism, different operating parameters of the robot will produce different operating states. These different operating states will generate different original signals, which then stimulate the target to enter the next cycle of judgment. Therefore, a state storage matrix is introduced to store these different operating states.
where \(PR\) is the state storage matrix. \(P_{xy}\) represents the state parameter probability.
Establish a nonlinear safety system:
where \(x\) is the parameter state of the nonlinear mechanism. \(f\left( x \right)\) is a local Lipschitz continuous vector field.
For a local Lipschitz continuous vector field, a continuous state function \(g\left( x \right)\) is given, and a restriction condition is added, namely the constant \(K \ge 0\). So that in the state set area of the function \(g\left( x \right)\), any two different parameters satisfy
We call this local Lipschitz optimal. Among them, the weight is \(w_{i}\). The bias is \(b_{i}\). The activation function is \(h\left( x \right)\). The system stability constant is \(C\). For any two parameters \(x_{a}\) and \(x_{b}\), there exists:
When and only when the vector derivative of the ith layer is greater than 1, it is the most applicable moment for the activation function, at which time the rotation degree of the microvector field turns to the local Lipschitz vector field. That is, a Lipschitz vector field belonging to \({\mathbb{R}}^{n}\) is formed.
Consider the multistate changes of state parameters. Therefore, a multidimensional highorder nonlinear safety system is established based on the above nonlinear basic expression:
where \(x\) is the realtime parameter state. \(y\) is the predicted parameter state. \(n\) is the highdimensional state. \(f\left( {x_{1} ,x_{2} , \ldots ,x_{n} } \right)\) is a multidimensional local Lipschitz continuous vector field.
Establish a safety set for a multidimensional highorder nonlinear safety system:
where \(S\) is the safety set. \(x\) and \(y\) are parameter states. \({\mathbb{R}}^{n}\) is the nth dimension based on the continuous Lipschitz vector field. \(g\left( x \right)\) is the state function. \(h\left( x \right)\) is the activation function.
Introduce the safety boundary curve. It will establish different coordinate base points as the realtime position of the robot changes. The boundary point is close to the critical point of the safety set.
where \(\ell\) represents the safety boundary curve. \(R\) represents the realtime safety radius. \(x_{t}\) and \(y_{t}\) are the predicted coordinates of the robot in the next cycle. \(x_{robot}\) and \(y_{robot}\) are the realtime coordinates of the robot. \(k_{t}\) is the equilibrium error constant. It is positive when the predicted position of the robot is close to the target position and negative when it deviates from the target position.
Remark1
The safety limit curve is a critical function in a multidimensional highorder nonlinear system. It has the following properties.

(1)
When the state parameter is within the critical function, the system is in an absolutely stable state.

(2)
When the state parameter falls between the critical function and the safety set, the system is in a relatively stable state.

(3)
When the state parameter falls outside the safety set, the system is in an unstable state.
When the safety set is in normal conditions, the parameter probability of the Markov transition matrix changes with time. Therefore, we establish a highorder Lipschitz vector field dynamic system based on the Markov model. This system can switch critical functions and maintain the stability of highorder nonlinear systems.
where \(x = \left( {x_{1} ,x_{2} , \ldots ,x_{n} } \right)^{T} \epsilon R^{n}\). The physical meaning of each component represents a different action state. \(f\left( {x_{1} ,x_{2} , \ldots ,x_{n} } \right)\) is a multidimensional local Lipschitz continuous vector field. \({\text{y}}\) is the output of the system. \({\text{m}}\) represents the system observation coefficient. When \(p > 1\) is an odd number, C is the stability constant of Lipschitz. \(f\left( {x_{n  1} } \right)\) is a fixed value when the Lipschitz rotation is \(\frac{\pi }{2}\).
For \(x = \left( {x_{1} ,x_{2} , \ldots ,x_{n} } \right)^{T}\), set \(x_{1}\) to be the initial state of the robot. The time interval is \(0 \to t_{1}\). The terminal time of the time interval is the real time of state \(x_{1}\). Because the complexity of each exploration cycle is different, there are three different situations: \(t_{1} > t_{2} ,t_{1} < t_{2}\) and \(t_{1} = t_{2}\). According to the principle of time accumulation, \(t_{x1} < t_{x1} < \cdots < t_{xn}\) can be obtained. When the robot does not collide and arrives at the destination safely, the probability of each state must be greater than zero. Otherwise, the state of the system will be interrupted at a certain time node \(t_{xn}\). Therefore, the purpose of the highorder Lipschitz vector field dynamic system we established based on Markov is to make the system close to the ideal state, that is, \(P\left( {X\left( {t_{x1} } \right) = x_{1} , \ldots ,X\left( {t_{xn} } \right) = x_{n} } \right) > 0\). We can partition different time periods for processing. That is, "windowing". Given \(m = 0.2, p = 2, m = 0.6, p = 5\). The system stability constant is \(C = m/p\). \({\mathcal{H}}\) represents the general term for different state parameters. The highorder Lipschitz vector field dynamic system can be expressed as:
The robot we consider is a 6DOF robot planning test in threedimensional space. To adapt to a more general DDR, we define the following variables. \(x_{ddr} ,y_{ddr}\) and \(z_{ddr}\) are the current position coordinates of the robot. \(\theta_{ddr}\) is the current direction angle of the robot (the angle relative to the global coordinate system). \(v_{r}\) and \(v_{l}\) are the linear velocities of the right wheel and the left wheel respectively. \(\omega_{rl}\) is the angular velocity. \(q_{ri}\) is the angle of the ith joint. \(I_{ri}\) is the moment of inertia. \(\tau_{ri}\) is the joint torque. \(C_{rij}\) is the Coriolis torque matrix. \(G_{ri}\) is the gravity term. Then the restriction term in the 6DOF joint is expressed as \(\dot{q}_{ri} = \frac{1}{{I_{ri} }}\left( {\tau_{ri}  \mathop \sum \limits_{j = 1}^{6} C_{rij} \dot{q}_{rj}  G_{ri} } \right) {\text{for i}} = 1,2,...,6\). \(L\) is the distance between the two wheels. Then the continuous kinematic equation is discretized with a step size of \(\Delta t\). The discretized kinematic equation is
where \(x_{ddr}^{k} ,y_{ddr}^{k} ,z_{ddr}^{k}\) and \(\theta_{ddr}^{k}\) are the states of the robot at the kth step. \(x_{ddr}^{k + 1} ,y_{ddr}^{k + 1} ,z_{ddr}^{k + 1}\) and \(\theta_{ddr}^{k + 1}\) are the states of the robot at the k + 1th step. System (11) is extended to three dimensions. Set the path point \(P_{0} ,P_{1} , \ldots ,P_{n}\). The coordinates of each path point are represented by \(\left( {x_{i} ,y_{i} ,z_{i} } \right)\). \(P_{0} ,P_{1} , \ldots ,P_{n}\) are equivalent to extended nodes in the following text. The smooth curve of the trajectory in path planning can be represented as
where \(b_{i} \left( t \right)\) is the spline basis function. Let \(q = \left[ {q_{r1} ,q_{r2} , \ldots ,q_{r6} } \right]^{T}\). \(\tau = \left[ {\tau_{r1} ,\tau_{r2} , \ldots ,\tau_{r6} } \right]^{T}\). Then the Lagrangian of 6 degrees of freedom gives \({\mathbf{\mathcal{X}}}\left( q \right)\user2{\ddot{q}} + C_{rij} \left( {q,\dot{q}} \right) + G_{ri} \left( q \right) = \tau + h\left( {\dot{q}} \right)\). \(h\left( {\dot{q}} \right)\) is the activation function vector about \(\dot{q}\). Then under the action of the highorder Lipschitz vector field dynamic system, the 6degreeoffreedom dynamic model can be expressed as \(\mathop \sum \limits_{i = 1}^{6} J_{ri}^{T} \left( q \right)I_{ri} J_{ri} \left( q \right)\user2{\ddot{q}} + \mathop \sum \limits_{{{\varvec{i}} = 1}}^{6} \mathop \sum \limits_{{{\varvec{j}} = 1}}^{6} \left( {\frac{1}{2}\left( {\frac{{\partial {\varvec{m}}_{{{\varvec{rij}}}} }}{{\partial {\varvec{q}}_{{\varvec{k}}} }} + \frac{{\partial {\varvec{m}}_{{{\varvec{rik}}}} }}{{\partial {\varvec{q}}_{{{\varvec{rj}}}} }}  \frac{{\partial {\varvec{m}}_{{{\varvec{rjk}}}} }}{{\partial {\varvec{q}}_{{{\varvec{ri}}}} }}} \right)\dot{\user2{q}}_{{{\varvec{rj}}}} } \right)\dot{q} + \mathop \sum \limits_{i = 1}^{6} J_{ri}^{T} \left( q \right)h\left( {\dot{q}} \right) + F_{c}\). \(J_{ri} \left( q \right)\) is the Jacobian matrix of the ith joint. \(\frac{{\partial {\varvec{m}}_{{{\varvec{rij}}}} }}{{\partial {\varvec{q}}_{{\varvec{k}}} }}\) is the partial derivative of the inertia matrix element with respect to the joint angle. \(F_{c}\) is the influence of the joint friction force on the joint motion of the right and left wheels. The system state equation of DDR is expressed as
Assume that the system state variable is \({\rm X}_{ddr} = \left[ {x_{ddr}^{k} ,y_{ddr}^{k} ,z_{ddr}^{k} ,\theta_{ddr} ,x_{1} ,x_{2} , \ldots ,x_{n} } \right]\). According to the highorder Lipschitz vector field, we get:
Remark 2
According to the Markov characteristic, when the steady state of the system takes different values, the smoothness of the highorder nonlinear system is different. The parameter solution of each state is unique and asymptotically stable everywhere. When the steady state value of the system is outside the safe set, the stability constant expression is meaningless.
Complex barrier function
The extended terms generated by the “windowing” processing of the planning time will cause "redundant terms" in the dynamic system. Therefore, it is very important to establish a complex barrier function in the highorder Lipschitz vector field dynamic system. The main design idea of the complex barrier function is to effectively avoid repeated motion and speed up the system operation mechanism by repeatedly setting the barrier function.
Consider a multidimensional highorder nonlinear safety system, namely \(\dot{x}\dot{y}\dot{n} = f\left( {x_{1} ,x_{2} , \ldots ,x_{n} } \right)\). Based on the above, the barrier function is constructed. The basic idea of the barrier function is to construct a scalar function \(h\left( x \right)\) based on the structural information and safety constraints of the system. To detect the invariance of the safety set. Now for the highdimensional state, let \(G\left( x \right) =  h\left( x \right)\). If:
where \(L_{f}\) is the Lyapunov function, which is a scalar function used to determine the stability of the system. The safety set at this time is a stable and unchanged state set. \(L_{f} \cdot G\left( x \right)\) represents the Lie derivative of \(G\left( x \right)\) along the vector field \(L_{f}\). The scalar function \(h\left( x \right)\) is the barrier function that needs to be constructed in the system. It plays a vital role in the safe operation of the system. The complex barrier function is denoted as \(ch\left( x \right)\). It satisfies condition (15).
Figure 2 is the calibration principle of the complex obstacle function. The yellow circle in the figure represents an obstacle. It can be represented as a circle with a center point of \(O_{i}\) and a radius of \(r_{i}\). \(O_{i} = \{ point \in {\mathbb{R}}^{n} \left \right\left {point  O_{i} } \right \le r_{i} \}\). \(point\) is an arbitrary point and \(\left {\left \cdot \right} \right\) is the Euclidean distance. Safe Distance refers to the minimum distance that the robot must maintain from the obstacle during path planning. This minimum allowed distance can be represented as \(d_{safe}\). It is a preset constant. To ensure the reasonable size of \(d_{safe}\), we define an extended obstacle area \(O_{i}^{\prime }\). That is, the blue area. \(O_{i}^{\prime } = \{ point \in {\mathbb{R}}^{n} \left \right\left {point  O_{i} } \right \le r_{i} + d_{safe} \}\). Then the complex obstacle function can be represented as \(ch\left( x \right) = \left\{ {\begin{array}{*{20}l} {\infty , {\text{if}} point \in O_{i}^{\prime } {\text{for any}} i} \hfill \\ \hfill \\ {0, {\text{otherwise}}} \hfill \\ \end{array} } \right.\). When the robot enters the extended obstacle area, a very large value (\(\infty\)) will be given, forcing the path planning algorithm to avoid these areas. The path pointed by the red arrow is the Infeasible path. It means that the robot will enter the safe distance range of the obstacle if it takes this path, resulting in a collision. If the original position of the robot is marked as an obstacle function, so that \(L_{f} \cdot G\left( x \right) > 0\), then the original position is an unsafe area. In the next cycle, the robot will automatically avoid this position to avoid repeated exploration of the original position. This reduces the data complexity of the next cycle.
Theorem 1
For system (6), different states start from the safe set S. After a series of planning tasks, they always stay inside the safe set S. Then the highorder system is safe.
In order to expand the complex obstacle function to highdimensional space, we combine the 6DOF robot dynamics equation to illustrate the applicability of the complex obstacle function in highdimensional space. Given \({\mathbf{\mathcal{X}}}\left( q \right)\user2{\ddot{q}} + C_{rij} \left( {q,\dot{q}} \right) + G_{ri} \left( q \right) = \tau + h\left( {\dot{q}} \right)\). For the highdimensional state \(q\), introduce the complex obstacle function \(ch_{q}^{ob} \left( q \right) = \mathop \sum \limits_{i = 1}^{n} \left( {\frac{{d^{2}  \left {\left {q  q_{i} } \right} \right^{2} }}{{d^{2} }}} \right)\). \(q_{i}\) is the position of the ith obstacle, and \(d\) is the safe distance between the robot and the obstacle. In order to ensure the effectiveness of the complex obstacle function in highdimensional space, we incorporate it into the dynamics equation and define the penalty function \({\mathbf{\mathcal{P}}}\left( {ch_{q}^{ob} \left( q \right)} \right)\) to deal with the obstacle problem in highdimensional space.
The complex obstacle function is introduced into the dynamic equation of the 6DOF robot to obtain \({\mathbf{\mathcal{X}}}\left( q \right)\user2{\ddot{q}} + C_{rij} \left( {q,\dot{q}} \right)\dot{q} + G_{ri} \left( q \right) + h\left( {\dot{q}} \right) + {\mathbf{\mathcal{P}}}\left( {ch_{q}^{ob} \left( q \right)} \right)I_{ri} = \tau\).
\({\mathbf{\mathcal{P}}}\left( {ch_{q}^{ob} \left( q \right)} \right)\) is used to calibrate the complex obstacle function. \({\mathbf{\mathcal{P}}}\left( {ch_{q}^{ob} \left( q \right)} \right) = {\mathbf{\mathcal{P}}}^{{\mathbf{k}}} ch_{q}^{ob} \left( q \right)I_{ri}\). \({\mathbf{\mathcal{P}}}^{{\mathbf{k}}}\) is the penalty coefficient. Define the position and orientation angle of each joint in the system state vector \(q\). \(q = \left[ {q_{r1} ,q_{r2} , \ldots ,q_{r6} } \right]^{T} \to \left[ {x_{ddr}^{k} ,y_{ddr}^{k} ,z_{ddr}^{k} ,\theta_{ddr}^{x} ,\theta_{ddr}^{y} ,\theta_{ddr}^{z} } \right]^{T}\). The complex obstacle function can be expressed as \(ch_{q}^{ob} \left( q \right) = \mathop \sum \limits_{i = 1}^{n} \left( {\frac{{d^{2}  \left {\left {q  q_{i} } \right} \right^{2} }}{{d^{2} }}} \right)\). The working mechanism is as follows. \(ch_{q}^{ob} \left( q \right)\) will change dynamically according to the distance between the robot and the obstacle. When \(d\) decreases, \(\left {\left {q  q_{i} } \right} \right\) decreases, causing \(ch_{q}^{ob} \left( q \right)\) to increase. Assuming that the position of the robot at a certain moment is \(q_{t}\) and the position of the obstacle is \(q_{s}\), it can be seen from \(ch_{q}^{ob} \left( q \right) = \mathop \sum \limits_{i = 1}^{n} \left( {\frac{{d^{2}  \left {\left {q  q_{i} } \right} \right^{2} }}{{d^{2} }}} \right)\) that \(\left {\left {q_{t}  q_{s} } \right} \right \approx 0\). Causes \(ch_{q}^{ob} \left( q \right)\) to increase, thereby \({\mathbf{\mathcal{P}}}\left( {ch_{q}^{ob} \left( q \right)} \right)\) increases. Through this mechanism, we can ensure that the robot's path planning in highdimensional space always avoids obstacles, thereby maintaining the safety and stability of the system.
The Markov highorder Lipschitz vector field dynamic system can be expressed as
where \({\mathbf{\mathcal{P}}}\left( {ch_{q}^{ob} \left( q \right)} \right)\) is the penalty function. It completes the conversion of the next cycle by adding the penalty function of the path that has been taken in the previous cycle. In summary, we can establish a related model system. Figure 3 shows the nonlinear control model system established based on robot behavior planning.
The system is a robotbased behavior planning and decisionmaking model. It enables the robot to move and make decisions in a complex environment through a multidimensional highorder nonlinear system and a complex obstacle function. The system uses a Markov decision model, which can analyze and evaluate the robot’s state and environment, and then make the best decision based on the evaluation results. In addition, the system also takes the existence of obstacles into account. It uses an obstacle function to restrict the robot's movement trajectory, so that the path planning can always stay within the safe set, that is, satisfy Theorem 1.
Classification LSTM
Long ShortTerm Memory (LSTM)^{22} is a variant of Recurrent Neural Network (RNN) that is specifically designed to process and predict time series data. LSTM solves the gradient vanishing and gradient exploding problems of traditional RNN when processing long sequence data by introducing memory units and gating mechanisms. Therefore, LSTM performs well in processing longterm dependencies and complex nonlinear data. Its structural design diagram is shown in Fig. 4.
In Fig. 4, LSTM is mainly composed of input gate, forget gate, output gate and several cell states. There are four Sfunction units at the bottom. They control the conduction and cutoff of the gate by judging the data. The three on the right will pass through the transmission gate to determine whether the input is passed into the block. The second one on the left is the input gate. If the input is close to zero, it will not enter the next layer. The third one on the left is the forget gate. When the generated value is close to zero, the data in the block will be discarded. The rightmost is the output gate, which determines whether the gate in the block memory is output. The input gate calculates a value between 0 and 1 based on the current input and the hidden state of the previous moment through a sigmoid activation function, indicating which information needs to be retained. The forget gate determines which information in the memory cell needs to be forgotten. It also calculates a value through a sigmoid activation function to filter the memory cell state of the previous moment. In this way, LSTM can discard historical information that is no longer relevant and only retain important information related to the current path planning. The output gate controls the output information of the memory cell. It determines which information needs to be output to the hidden state of the next moment. However, a large amount of data is generated in each time period of multidimensional highorder system path planning, which requires traditional LSTM to store and update a large number of weights and intermediate states. Therefore, we designed a new LSTM (Classification LSTM) based on the highorder Lipschitz vector field. Figures 5 and 6 are the structural design diagram and application network model diagram of Classification LSTM respectively.
As shown in Fig. 5, the Classification LSTM is mainly composed of LSTM and Gaussian mixture model (GMM). The output \(h_{t}\) of LSTM is passed to the Gaussian mixture classification module. \(h_{t}\) is generated by the memory cell state and output gate control. It is output by setting the threshold \(\theta\) and the activation function \(y_{i}\). Then, multiple Gaussian distributions \(\left( {\omega_{1} ,\omega_{2} , \ldots ,\omega_{n} } \right)\) are generated after passing through the Gaussian mixture classification module, and each distribution corresponds to a category. This model combining LSTM and GMM can make full use of the time series modeling ability of LSTM and the classification ability of GMM, and is suitable for complex time series classification tasks. In path planning, data of different cycles have complex time series associations. Through Classification LSTM, these time dependencies can be better modeled. GMM can perform more refined classification based on the LSTM output. By generating multiple Gaussian distributions, it can more accurately correspond to different path planning decision categories.
In Fig. 6, the model starts from the starting point, initializes the necessary parameters and variables, and the data flow is transmitted through information simplification and information channels. In the input gate stage, LSTM decides which information to retain based on the current input and the hidden state of the previous moment, while the output gate controls the information to be transmitted to the next cycle. The information simplification module preprocesses the input data, removes redundant information, and improves processing efficiency. The model obtains the necessary cycle data from the external environment and trains the Classification LSTM through the training sample pool and training data. The first Classification LSTM module uses the time series modeling capability of LSTM and combines GMM for preliminary classification to generate multiple Gaussian distributions, each of which corresponds to a possible path planning decision. The second Classification LSTM module further processes and classifies the data to improve the accuracy and reliability of the decision.
Remark3
Set the receiving gate of the data to a fourtuple \(S_{p} = \left( {\rho ,\alpha ,\mu ,\tau } \right)\). The data accepted by each pole gate can be equivalent to a small Datebase. Set input to the incoming block, and the output of the input gate is not zero. The loss function is represented by Loss.
where \(\rho\) and \(\alpha\) are the input of the initial layer transfer gate respectively. \(Loss\left( {\rho /\alpha } \right)\) is the loss function of the initial layer. \(Qs_{p1} \left( {\rho ,\alpha } \right)\) represents the value tuple function after passing through the transfer gate. \(y_{i}\) is the initial value of the sigmoid function. Here, an initial value of the activation function is given to control data loss. \(f\left( {x_{i} , \theta } \right)\) represents the predicted data information in the unknown detection process. \(x_{i}\) is the true value of the data from the ith layer transfer gate. \(\theta\) represents the threshold. The periodic data meets the Classification LSTM output terminal
Based on the above observations, with the help of LSTM to capture time dependency and GMM’s multimodal classification capabilities, the model can process periodic data in real time and adapt to dynamically changing environments, thereby generating more optimized paths. In the experimental section on path planning quality later, the reduction in the number of scalable nodes is closely related to Classification LSTM.
Deployment method of visual sensors in a limited area
Path marking backtracking strategy and dead zone path simplification are two important strategies of the deployment method.
Obtaining the information relationship between the robot system and the target position is the premise of studying the path marking backtracking strategy. The information relationship diagram is shown in Fig. 7.
The excitation function \(p\left( {x, y} \right)\) and the inhibition function \(q\left( {x, y} \right)\) are the key to obtaining information relationships. The calculation formula is as follows:
where \(x_{object}\) and \(y_{object}\) are the realtime positions of the robot. \(x_{hinder}\) and \(y_{hinder}\) are the coordinate information of the obstacle. \(x_{goal}\) and \(y_{goal}\) are the target coordinate information. \(M_{robot}\) is the mass of the robot. When the robot is close to the target position, \(p\left( {x, y} \right)\) increases. When the robot is close to the obstacle position, \(q\left( {x, y} \right)\) increases. Therefore, with each step, the excitation function and the inhibition function will continue to change.
Theorem 2
Single plane exploration displacement data set . Initial value of trajectory array \(start\left[ m \right]\left[ n \right]\). If the incremental position of the next trajectory \({\text{star}}t^{ + } \left[ m \right]\left[ n \right]\) is \(\left( {x, y} \right)\), the exploration direction is \(u = u + du\). At the same time, the sensor deployment element rules are met, then \({\text{star}}t^{ + } \left[ {m_{x} } \right]\left[ {n_{y} } \right] = 1\), that is, the path marking is completed. \(start\left[ m \right]\left[ n \right] = {\text{star}}t^{ + } \left[ m \right]\left[ n \right]  {\text{star}}t^{ + } \left[ {m_{x} } \right]\left[ {n_{y} } \right]\). The initial position displacement data set becomes , and the iteration is repeated until \({\mathcal{M}} = \emptyset\).
Based on Theorem 2, the expansion function in RRT is modified and optimized according to (20) and (21), as shown in Table 1.
In Table 1, \(M\) is the path planning environment. \(V\) is the vertex set. \(E\) is the edge set. \(x_{init}\) is the starting position. \(x_{goal}\) is the target position. \(\Gamma\) is the path. \(i\) is the number of iterations. \(ch\left( x \right)\) is the complex obstacle function. In this process, any node \(x_{rand}\) is randomly emitted from \(x_{init}\), and its closest distance \(x_{near}\) is found. Then \(x_{new}\) is the \({\text{Stepsize}}\) from \(x_{near}\) to \(x_{rand}\). If the activation function \(p\left( {x_{new} } \right)\) of the new node is greater than the activation function \(p\left( {x_{i  1} } \right)\) of any previous node, then the new node is expanded into the set. \({\text{CollisionFree}}\left( {M^{\prime},{ }E_{i} } \right)\) is used to monitor obstacles in \({\text{Edge}}\left( {x_{new} , x_{near} , ch\left( x \right)} \right)\).
Completing the local full traversal of the current segment based on systems (20) and (21) and Theorem 2 is the key to optimizing the deployment. First, set the planning environment to be divided into several unit areas. The side length of the unit area is much smaller than the set sensor detectable range. The set number of sensors must be greater than the number of path marking positions. The set of visual sensors is represented by \(Sensor = \left\{ {Sr_{1} , Sr_{2} , \ldots , Sr_{n} } \right\}\). The deployment element of each sensor is represented by \(element = \left\{ {ep_{1} ,ep_{2} , \ldots ,ep_{n} } \right\}\). When the planned position satisfies the deployment element of a certain sensor, the path is marked. For each step forward, if Theorem 2 is satisfied, it is iterated and updated repeatedly until the optimal planning route is found. We can simulate this process through a simple node model.
The node model diagrams in Fig. 8a,b describe the implementation process of the path marking backtracking strategy. In (a), 14 nodes represent key positions in path planning, of which 11 blue nodes represent marked path positions and 3 yellow nodes represent path positions to be marked. Arrows represent the association between nodes. For example, node 1 points to node 4 and then to node 7, indicating the path direction of the robot. In (b), the color of some nodes changes, with nodes 1 and 4 turning red and node 7 turning green, indicating the application of the path marking backtracking strategy. At the same time, the path from 1 to 4 and then to 7 disappears, indicating that the path planning has been updated according to the data to avoid the inaccessible area.
Dead zone path simplification is to remove the "closed dead zone". It will further realize the partial path simplification after the path marking backtracking strategy. When there is no sensor deployment at the planned location, that is, there is no associated sensor deployment element, then the planned location is a dead zone location. The specific process is shown in Table 2.
Based on the above information, we assume that there are \(N_{s}\) visual sensors, and the position of each sensor \({\text{s}}_{{\text{i}}}\) is represented by \(\left( {{\text{x}}_{{{\text{s}}_{{\text{i}}} }} ,{\text{y}}_{{{\text{s}}_{{\text{i}}} }} ,{\text{z}}_{{{\text{s}}_{{\text{i}}} }} } \right)\). Its coverage is determined by the radius \(r_{i}\). Then the deployment set of sensors is represented by \({\mathcal{S}} = \left\{ {\left( {x_{{s_{i} }} ,y_{{s_{i} }} ,z_{{s_{i} }} } \right),\left( {x_{{s_{i} }} ,y_{{s_{i} }} ,z_{{s_{i} }} } \right), \ldots ,\left( {x_{{s_{i} }} ,y_{{s_{i} }} ,z_{{s_{i} }} } \right)} \right\}\). Assume that the map is divided into \(M \times N \times L\) volume units. Use a threedimensional array \({\mathbb{M}}\) to represent the map, and its element \(m_{i,j,k}\) represents the state at the volume unit \(\left( {x_{i} ,y_{j} ,z_{k} } \right)\). Then map information acquisition can be represented by \({\mathbb{M}}\).
where \(M,N\) and \(L\) are the resolutions of the map in the \(x,y\) and \(z\) directions respectively. The state here can be 1 for obstacles or 0 for no obstacles.
Define \(S\left( {x,y,z} \right)\) to represent the area covered by the sensor. For each volume unit \(\left( {x_{i} ,y_{j} ,z_{k} } \right)\) in the map, it is expressed as follows:
where \(x_{sl} ,y_{sl}\) and \(z_{sl}\) represent the coordinates of the lth sensor. \(r_{l}\) is the coverage radius of the lth sensor. \({\text{max}}\) represents the maximum value operator, which is used to determine whether the sensor covers the volume unit.
Remark4
The goal of sensor deployment is to cover as much area as possible on the map to obtain comprehensive environmental information. We achieve this goal by optimizing the location and coverage of sensors. Map information acquisition is to collect environmental data through sensors and convert it into map state. This process includes the following steps.

(1)
Sensors sense the surrounding environment and collect data at various locations.

(2)
The collected data needs to be processed and analyzed to extract useful information.

(3)
Based on the processed data, we can build a map to represent the state of the environment. The map can be twodimensional or threedimensional, where each cell or volume unit corresponds to an area in the environment. Set the threedimensional space to \({\mathbb{R}}^{3}\). The known map state matrix is \({\mathbb{M}}\). Then update the state matrix based on the results of sensor data processing. \({\mathbb{M}} = g\left( {D,m_{i,j,k}^{{\left( {t  1} \right)}} } \right)\). Among them, \(g\) is the state update function, and \(m_{i,j,k}^{{\left( {t  1} \right)}}\) is the map state in the previous cycle. \(D\) is the data processing result of the sensor.

(4)
The state of the map needs to be updated over time to reflect the dynamic changes of the environment. Sensors continuously sense the environment and update the map state to ensure the timeliness and accuracy of map information. During the update cycle with a time step of \(\Delta t\), the dynamic changes of the environment can be modeled by the state transition function. The state transition function is \({\mathcal{L}}\left( {m_{i,j,k}^{{\left( {t  1} \right)}} ,D^{\left( t \right)} } \right) = m_{i,j,k}^{\left( t \right)}\). \({\mathcal{L}}\) combines the state of the previous time step and the sensor data of the current time step to update the current state. Then the dynamic update of the map state can be expressed as \(m_{i,j,k}^{\left( t \right)} = {\mathcal{L}}\left( {m_{i,j,k}^{{\left( {t  1} \right)}} ,S\left( {x_{{s_{i} }} ,y_{{s_{j} }} ,z_{{s_{k} }} } \right),D^{\left( t \right)} } \right)\). It provides a detailed description of the sensor coverage area and the dynamic update of the map state. Through these descriptions, we can accurately deploy sensors and acquire and update map information.
To better describe the deployment process, we simulated the environment information with a map size of \(200 \times 200\). The resolution was 1.40 dynamically changing obstacles were simulated 6 times, as shown in Fig. 9. The starting point and the target point were set at (20, 20) and (180, 180), respectively, represented by blue and red dots. 30 visual sensors were deployed, and the coverage radius of the sensor was 20. If a point in the environment is covered by a sensor, the point is marked as a backtracking point. These points are backtracked in the final path and are not actually walked through.
Experiment analysis
To verify the feasibility of the theory, we verify it from three aspects: the quality of path planning in a twodimensional static environment, the quality of path planning in a threedimensional dynamic environment, and the effect of data prediction. The methods we compare include Traditional RRT, Extended_RRT, Goalbias RRT, RRT*, Dynamic weighted RRT algorithm (DWRRT)^{23} and BPRRT*^{24}.
Path planning quality in 2D static environment
We compared IRPA (RRT planning for the PTM algorithm) with Extended_RRT, Goalbias RRT, and RRT*. Planning time, response cycle, and number of extended nodes were used as evaluation indicators. The planning environment size was 50 × 30. Static planning was adopted. The coverage radius of the sensor was set to 5, and each algorithm was run 10 times. PR is used to store the coverage information of the sensor position in each run. That is, \(PR = \left( {\begin{array}{*{20}c} {Sensor_{11} } & \cdots & {Sensor_{1n} } \\ \vdots & \ddots & \vdots \\ {Sensor_{n1} } & \cdots & {Sensor_{nn} } \\ \end{array} } \right)\). Then the update of \(Sensor_{ij}\) can be regarded as the weight change of the local Lipschitz. \(w_{i} \leftarrow Sensor_{ij}\). The planning time can be directly obtained through software testing. The test method for the number of extended nodes and response cycle is shown as follows.

(1)
Number of extended nodes. During each run, when the algorithm generates a path from the starting point to the target point, each new node generated is counted once. The number of extended nodes is the number of all nodes generated by the algorithm before finding a valid path. When the algorithm termination condition is reached, the total number of extended nodes is recorded as \(N_{nodes}\). Given \(Output Date = \mathop \sum \limits_{i = 1}^{4} (y_{i} \left( {Qs_{p1} \left( {\rho ,\alpha } \right)} \right)  \theta )\) and \(w_{i} \leftarrow Sensor_{ij}\). Then, when the algorithm is not completely terminated, the generation of new nodes causes the value tuple function of \(Qs_{p1} \left( {\rho ,\alpha } \right)\) to be lost after passing through the transfer gate. That is, \(Qs_{p1} \left( {\rho ,\alpha } \right)_{t + 1} < Qs_{p1} \left( {\rho ,\alpha } \right)_{t}\). Then the output gate will generate fewer nodes. That is, \(N_{nodes} _{t + 1} < N_{nodes} _{t}\). After repeated iterations, take the average expansion node of 10 times. \(\overline{{N_{nodes} }} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} N_{i}\).

(2)
During each run, record the time interval of each step as \(\Delta {\text{t}}\). The response cycle is the time required for the algorithm to generate the next state \(\left( {P_{state} _{t} } \right)\) from the current state \(\left( {P_{state} _{t + 1} } \right)\). In each run, calculate the average of all time intervals. That is, \(T_{cycle}\). When the algorithm termination condition is reached, record the average response cycle. Calculate the average response cycle of 10 times. \(\overline{{T_{cycle} }} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} T_{cycle}^{\left( i \right)}\). \(T_{cycle} = \frac{1}{{N_{nodes} }}\mathop \sum \limits_{i = 1}^{{N_{nodes} }} \Delta t_{i}\). \(T_{cycle}^{\left( i \right)}\) is the average response cycle of the ith run. \({\text{n}} = 10\).
The comparison results of path planning quality are shown in Fig. 10.
In Fig. 10, the blue and green blocks are the starting and ending points. The red curve is the planned route. The green curve is the extended node. The gray part is the obstacle position. (a) It uses the information of the initial path instead of completely randomly scattering points. It has disadvantages such as excessive generation of branches. (b) It has both excessive branches and drawbacks such as falling into local optimal solutions. However, compared with (a), the search time is shorter and the number of extended nodes is less. (c) Compared with the former, the planned path is significantly shortened, but the response cycle is long and the number of extended nodes is large. In addition, it also has problems such as overoptimization and parameter dependence. (d) While ensuring the completeness of the search path, it achieves the effect of fewer extended nodes and shorter search time. The parameter comparison of the four methods is shown in Table 3.
Table 3 shows that the planning time, response cycle and number of expanded nodes generated by IPRA are better than those of the other methods. Among them, the planning time is reduced by 76.4% of the original. The response time is reduced by 82.3% of the original. The number of expandable nodes is reduced by 86.7% of the original.
As known from Remark 4, the state of the map needs to be updated over time. While ensuring the accuracy of the map information, the state transition function is modeled with a time step of \(\Delta t\). The state transition function is expressed as \({\mathcal{L}}\left( {m_{i,j,k}^{{\left( {t  1} \right)}} ,D^{\left( t \right)} } \right) = m_{i,j,k}^{\left( t \right)}\). Then the number of times \({\mathcal{L}}\left( {m_{i,j,k}^{{\left( {t  1} \right)}} ,D^{\left( t \right)} } \right)\) is updated with \(\Delta t\) represents the number of iterations of the algorithm to search for the optimal path. In continuous iterations, the iteration error will change with the change of \({\mathcal{L}}\left( {m_{i,j,k}^{{\left( {t  1} \right)}} ,D^{\left( t \right)} } \right)\). The iteration error refers to the gap between the generated path and the optimal path during the path planning process. It is similar to the discrete error in^{12}. This error can be expressed as \(Path_{er} = \left {{\mathcal{L}}\left( {m_{i,j,k}^{{\left( {t  1} \right)}} ,D^{\left( t \right)} } \right)_{\Delta t}  {\mathcal{L}}\left( {m_{i,j,k}^{{\left( {t  1} \right)}} ,D^{\left( t \right)} } \right)_{{\Delta t^{  } }} } \right\) based on \({\mathcal{L}}\left( {m_{i,j,k}^{{\left( {t  1} \right)}} ,D^{\left( t \right)} } \right) = m_{i,j,k}^{\left( t \right)}\). \(\Delta t^{  }\) is the previous search step relative to \(\Delta t\). In IRPA, the initial step size is set to 1 and the error threshold is set to 0.01. The target bias probability of traditional RRT is set to 0.1. Initialize the highorder Lipschitz vector field parameters and the Classification LSTM model. In each iteration, the path is adjusted according to the error value, and the Classification LSTM is used to filter the data. When the error is lower than the threshold, the search path is terminated and the search iteration is stopped. Figure 11 shows this iterative process.
In Fig. 11, the number of iterations of IRPA is about 2400, and the error at this time is less than 0.01. There was also a situation where the error was less than 0.01 before 2400, but the adjustment of the nodes proved that it was not the optimal path at this time. (a) The figure shows staggered spike changes. This shows that in each iteration, there is a large uncertainty and fluctuation in the position of the new node generated by the algorithm. This is because IRPA will continuously optimize and adjust during the path generation process, resulting in a large change in the error. The traditional algorithm still did not converge after nearly 20,000 iterations. Because the traditional RRT uses a fixed step size and rules in the path generation process, the error changes are relatively stable. Therefore, (b) shows a smooth blue line.
Path planning quality in 3D dynamic environments
In order to make the path planning more applicable to general Differential Drive Robots (DDR), we simulated and tested the method through a threedimensional scene. The map environment also changed from static to dynamic. This dynamic change simulates the actual environment more realistically. The map size is set to \(20 \times 20 \times 10\). The number of sensors is set to 10. The algorithm step size in the RRT expansion function is set to 0.2. The backtracking step size is 0.2. In the LSTM output, the output state in each step is represented as \(h_{t} = \alpha \left( {W_{h} \cdot \left[ {h_{t  1} ,x_{t} } \right] + \theta } \right)\). \(x_{t}\) is the input data, and \(W_{h}\) is the weight. \(h_{t  1}\) is the output state of the previous step. Then the parameters in \(Output Date = \mathop \sum \limits_{i = 1}^{4} (y_{i} \left( {Qs_{p1} \left( {\rho ,\alpha } \right)} \right)  \theta )\) change due to the update of \(h_{t}\). By adjusting \(W_{h}\), \(h_{t}\) and \(Output Date\) change until all the initial position data sets are empty. That is, \({\mathcal{M}} = \emptyset\). At the same time, \(Qs_{p1} \left( {\rho ,\alpha } \right)_{t + 1} < Qs_{p1} \left( {\rho ,\alpha } \right)_{t}\). The number of node expansions that occur each time \(h_{t}\) is adjusted is different. Figure 12 simulates the path planning test in a threedimensional dynamic environment.
Figure 12 simulates two debugging of \(W_{h}\). They are denoted as \(W_{h1}\) and \(W_{h2}\) respectively. The number of nodes expanded in \(W_{h1}\) is 569, and the number of nodes expanded in \(W_{h2}\) is 272. The environment is dynamically changed based on the debugging parameters of \(W_{h1}\) and \(W_{h2}\). In \(W_{h1}\), the number of expanded nodes is close to Extended_RRT, and \({\mathcal{M}}\) can never search for empty data, and the loop continues. Therefore, the planned path mistakenly regards obstacles as feasible paths, resulting in collisions. In \(W_{h2}\), we did an interesting test. \(Qs_{p1} \left( {\rho ,\alpha } \right)_{t + 1} < Qs_{p1} \left( {\rho ,\alpha } \right)_{t}\) and \(Qs_{p1} \left( {\rho ,\alpha } \right)_{t + 1} > Qs_{p1} \left( {\rho ,\alpha } \right)_{t}\) were set respectively. The optimal design path and the collision path were obtained. It can be seen that the initial position data is crucial for the search conditions of the expanded nodes in a threedimensional state. Figure 13 tests this dynamically changing environment.
In Fig. 13, A is a laser radar, which is used to scan environmental information. B is a gyroscope, which is used to determine the angle of the avoidance system, such as the deflection angle, pitch angle and yaw angle. C is a router, which is used for remote debugging. D is a CPU, which is used to receive intelligent input algorithms and is the core of the avoidance system. E is a singlechip microcomputer, which can control the drive of the entire system as a control system.
Data prediction effect
In path planning, the map data in each time step must be predicted to determine whether it is the optimal node or obstacle. Therefore, the data prediction effect is particularly important. We compared IRPA with DWRRT and BPRRT* for data prediction. DWRRT^{23} is an improved RRT algorithm based on dynamic weighted maps recently proposed by Zhao and Wu. This algorithm can predict the path where collision may occur in advance when the search path is complete, effectively improving the efficiency of path planning. BPRRT*^{24} is a path planning algorithm based on BP neural network and improved rapidly expanding random tree proposed by Gao and Yuan et al. However, due to the discretization of obstacles and distance weight functions, the stability of planning is limited to a certain extent.
In this experiment, we take the difference between the actual and observed node expansion in each time step as the ideal data. When the predicted value is higher than the actual observed value, the prediction error is positive, that is, \(\left {Pre_{i}  Act_{i} } \right > 0\); conversely, when the predicted value is lower than the actual observed value, the prediction error is negative, that is, \(\left {Pre_{i}  Act_{i} } \right < 0\). In Markov’s nonlinear analysis parameters, \(\Delta t = 0.1s\). The number of layers of the LSTM network in the Classification LSTM is 4. Each layer contains 100 neurons. The training data contains 100, 150 and 200 data samples respectively. The characteristic dimension of the input data is 10. The backtracking step is 0.2. Then each data sample is used as the action state of the system (14). That is, \(x = \left( {x_{1} ,x_{2} , \ldots ,x_{n} } \right)^{T} \epsilon R^{n} \to x = \left( {date_{d1} ,date_{d2} , \ldots ,date_{dn} } \right)^{T} \epsilon R^{n}\). \(f\left( {x_{1} ,x_{2} , \ldots ,x_{n} } \right) \to f\left( {date_{d1} ,date_{d2} , \ldots ,date_{dn} } \right)\) is a multidimensional local Lipschitz continuous vector field. The weights assigned to each layer of LSTM are set to \(\omega_{lstm}^{1} = 0.4,\omega_{lstm}^{2} = 0.2,\omega_{lstm}^{3} = 0.2,\omega_{lstm}^{4} = 0.2\). \(p\left( {x_{new} } \right)\) in the prediction always satisfies \(p\left( {x_{new} } \right) > \forall p\left( {x_{i  1} } \right)\). Regardless of \(\left {Pre_{i}  Act_{i} } \right > 0\) or \(\left {Pre_{i}  Act_{i} } \right < 0\), we construct an evaluation function about \(p\left( {x_{new} } \right)\).
where \(L_{path}\) is the path length. \(C_{collsions}\) is the number of extension points collected by \(CollisionFree\left( {M^{\prime } ,{ }E_{i} } \right)\). \(T_{exec}\) is the execution time of each layer of LSTM. \(E_{pred}\) is the prediction error. \({\text{n}} = 100,150,200\). It is ensured that \(CollisionFree\left( {M^{\prime } ,{ }E_{i} } \right)\) always has a quantitative advantage in any prediction. That is, \(Num\left( {CollisionFree\left( {M^{\prime } ,{ }E_{i} } \right)} \right) > Num\left( {Edge\left( {x_{new} , x_{near} , ch\left( x \right)} \right)} \right)\).
In Fig. 14, IRPA has good curve fit and difference stability with the ideal search data, indicating that its prediction error is small. Although DWRRT and BPRRT* also follow the ideal data, they fluctuate greatly, indicating that the prediction error is relatively high. In the iteration of 100 data samples, the prediction error of IRPA is significantly lower than that of DWRRT and BPRRT*. In the iteration of 150 and 200 data samples, IRPA still maintains high accuracy with small error fluctuations.
Conclusion
This paper studies an improved RRT behavior planning method for robots based on the PTM algorithm. A new solution is proposed for the problem of unstable path quality in multidimensional highorder nonlinear systems. We designed a highorder Lipschitz vector field dynamic system, a Classification LSTM, and a method for limited area sensor deployment. The highorder Lipschitz vector field dynamic system can adapt to the highorder dynamic characteristics of multidegreeoffreedom robots and demonstrates excellent accuracy in path planning. The Classification LSTM ensures the global stability of data in path design by screening the training data model. The path marking backtracking strategy and dead zone path simplification method significantly improve the efficiency of path planning by deploying visual sensors in a limited area. This method effectively covers key areas in the environment and achieves better path planning results. Finally, this paper compares IRPA with other algorithms through simulation experiments in twodimensional and threedimensional environments to verify the superiority of IRPA.
Although the limited area sensor deployment strategy shows significant effects on path marking backtracking and dead zone path simplification, its effectiveness is limited in closed environments, such as the path design inside the box of an automated loading and unloading truck^{25}. In the future, we will strengthen the design of sensor deployment in more complex environments, such as a sensor array data monitoring method based on DLM Mask RCNN that we recently studied. This will provide new ideas and solutions for path search.
Data availability
The datasets generated and/or analysed during the current study are available in the Zenodo repository, https://zenodo.org/records/12730125.
Abbreviations
 PTM:

Path tracking method
 RRT:

Rapidlyexploring random tree
 LSTM:

Long shortterm memory
References
Yao, Z. et al. ReinforcedRimJump: Tangentbased shortestpath planning for twodimensional maps. IEEE Trans. Industr. Inf. 16(2), 949–958 (2019).
Jan, G. E., Sun, C. C., Tsai, W. C. & Lin, T. H. An O(n log n) shortest path algorithm based on delaunay triangulation. IEEE/ASME Trans. Mech. 19(2), 660–666 (2013).
Mayya, S., Pierpaoli, P., Nair, G. & Egerstedt, M. Localization in densely packed swarms using interrobot collisions as a sensing modality. IEEE Trans. Rob. 35(1), 21–34 (2018).
Li, B. & Chen, B. An adaptive rapidlyexploring random tree. IEEE/CAA J. Autom. Sin. 9(2), 283–294 (2021).
Moon, C. & Chung, W. Kinodynamic planner dualtree RRT (DTRRT) for twowheeled mobile robots using the rapidly exploring random tree. IEEE Trans. Industr. Electron. 62(2), 1080–1090 (2014).
Saccuti, A., Monica, R. & Aleotti, J. PROTAMPRRT: A probabilistic integrated task and motion planner based on RRT. IEEE Robot. Autom. Lett. 8(12), 8398–8405 (2023).
Lim, W. et al. Hybrid trajectory planning for autonomous driving in onroad dynamic scenarios. IEEE Trans. Intell. Transp. Syst. 22(1), 341–355 (2019).
Zhang, X., Zhang, R. & Wang, X. Visual slam mapping based on yolov5 in dynamic scenes. Appl. Sci. 12(22), 11548 (2022).
Rodriguez, S., Tang, X. & Lien, J. M., et al. An obstaclebased rapidlyexploring random tree. In Proceedings 2006 IEEE International Conference on Robotics and Automation, 2006. ICRA 2006. 895–900 (IEEE, 2006).
Lan, W. et al. Improved RRT algorithms to solve path planning of multiglider in timevarying ocean currents. IEEE Access 9, 158098–158115 (2021).
An, B., Kim, J. & Park, F. C. An adaptive stepsize RRT planning algorithm for openchain robots. IEEE Robot. Autom. Lett. 3(1), 312–319 (2017).
Hoy, M., Matveev, A. S. & Savkin, A. V. Algorithms for collisionfree navigation of mobile robots in complex cluttered environments: Asurvey. Robotica 33(3), 463–497 (2015).
Bianco, C. G. L. Minimumjerk velocity planning for mobile robot applications. IEEE Trans. Robot. 29(5), 1317–1326 (2013).
Zhou, B., Pan, J., Gao, F. & Shen, S. Raptor: Robust and perceptionaware trajectory replanning for quadrotor fast flight. IEEE Trans. Robot. 37(6), 1992–2009 (2021).
Nilles, Q., Ren, Y., Becerra, I. & LaValle, S. M. A visibilitybased approach to computing nondeterministic bouncing strategies. Int. J. Robot. Res. 40(10–11), 1196–1211 (2021).
Kumar, R. & Moore, J. B. Convergence of adaptive minimum variance algorithms via weighting coefficient selection. IEEE Trans. Autom. Control 27(1), 146–153 (1982).
Khalil, H. K. Nonlinear Systems 3rd edn. (PrenticeHall, 2002).
Karafyllis, I. & Krstic, M. Adaptive certaintyequivalence control with regulationtriggered finitetime leastsquares identification. IEEE Trans. Autom. Control 63(10), 3261–3275 (2018).
Jiang, C. et al. A fast and highperformance object proposal method for vision sensors: Application to object detection. IEEE Sens. J. 22(10), 9543–9557 (2022).
Sheffield, S. & Smart, C. K. Vectorvalued optimal Lipschitz extensions. Commun. Pure Appl. Math. 65(1), 128–154 (2012).
Mor, B., Garhwal, S. & Kumar, A. A systematic review of hidden Markov models and their applications. Arch. Comput. Methods Eng. 28, 1429–1448 (2021).
Lindemann, B. et al. A survey on anomaly detection for technical systems using LSTM networks. Comput. Ind. 131, 103498 (2021).
Zhao, B. et al. Research on hybrid navigation algorithm and multiobjective cooperative planning method for electric inspection robot. Energy Rep. 9, 805–813 (2023).
Gao, Q. et al. Path planning algorithm of robot arm based on improved RRT* and BP neural network algorithm. J. King Saud Univ.Comput. Inf. Sci. 35(8), 101650 (2023).
Luo, J., Wu, Y. & Mendes, A. B. Modelling of integrated vehicle scheduling and container storage problems in unloading process at an automated container terminal. Comput. Ind. Eng. 94, 32–44 (2016).
Acknowledgements
This work was completed while the author was studying for a master's degree in the School of Information and Electronic Engineering at Shandong Technology and Business University. The work was funded by the project of Yantai City Science and Technology Innovation to help the transformation of old and new kinetic energy. Fund: KXDNY202325 (Corresponding author: Zuoxun Wang).
Author information
Authors and Affiliations
Contributions
C. and W. Preparated of laboratory stand and wrote the main manuscript text. S. and Z. finished Simulation debugging. G. finished manuscript review edition. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons AttributionNonCommercialNoDerivatives 4.0 International License, which permits any noncommercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/byncnd/4.0/.
About this article
Cite this article
Cui, C., Wang, Z., Sui, J. et al. An improved RRT behavioral planning method for robots based on PTM algorithm. Sci Rep 14, 21776 (2024). https://doi.org/10.1038/s41598024726164
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598024726164