
Deep reinforcement learning-based obstacle avoidance motion planning for redundant manipulator considering the actual shape of obstacles

Published online by Cambridge University Press: 18 December 2025

Qing Yang
Affiliation:
School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu, China
Ju Chen*
Affiliation:
School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu, China
Yi Zhang
Affiliation:
School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu, China
Liang Ge
Affiliation:
School of Mechanical and Electrical Engineering, Southwest Petroleum University, Chengdu, China
Yun Xiao Wang
Affiliation:
School of Electrical Engineering and Information, Southwest Petroleum University, Chengdu, China
*Corresponding author: Ju Chen; Email: wy_cj2024@163.com

Abstract

Although deep reinforcement learning (DRL) has been studied extensively for robotic manipulators, little research maps the output of the policy function directly to the manipulator's joint space. This paper proposes a DRL-based obstacle avoidance motion planning scheme for redundant manipulators that accounts for the actual shapes of obstacles in the environment. The scheme plans the end-effector path, avoids obstacles autonomously, and yields the joint trajectories of the manipulator. First, a reinforcement learning framework based on the joint space is proposed. The framework uses the joint accelerations of the manipulator to calculate the Cartesian coordinates of the end-effector through forward kinematics, thereby performing end-to-end path planning for the end-effector. Second, the distances between all links of the manipulator and irregular obstacles are calculated in real time with the Gilbert–Johnson–Keerthi (GJK) distance algorithm. A reward function containing the joint accelerations is then constructed from these distances to realize obstacle avoidance for the redundant manipulator. Finally, simulations and physical experiments on a 7-degree-of-freedom manipulator demonstrate that the proposed scheme generates efficient, collision-free trajectories in environments with irregular obstacles.
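The pipeline outlined in the abstract (joint accelerations as the action, forward kinematics to the end-effector, and a reward that combines link-to-obstacle distance with joint acceleration) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the arm is a planar serial chain rather than the 7-degree-of-freedom manipulator, the obstacle is a circle whose distance to each link is computed in closed form as a stand-in for the GJK algorithm, and the function names, weights, and safety margin are hypothetical.

```python
import math


def integrate_joints(q, dq, ddq, dt):
    """Semi-implicit Euler step: joint accelerations -> velocities -> angles."""
    dq_next = [v + a * dt for v, a in zip(dq, ddq)]
    q_next = [p + v * dt for p, v in zip(q, dq_next)]
    return q_next, dq_next


def forward_kinematics(q, link_lengths):
    """Joint positions of a planar serial arm (assumed model).

    The last point in the returned list is the end-effector.
    """
    x = y = angle = 0.0
    points = [(0.0, 0.0)]
    for theta, length in zip(q, link_lengths):
        angle += theta
        x += length * math.cos(angle)
        y += length * math.sin(angle)
        points.append((x, y))
    return points


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    denom = abx * abx + aby * aby
    if denom == 0.0:
        t = 0.0
    else:
        t = ((p[0] - a[0]) * abx + (p[1] - a[1]) * aby) / denom
        t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * abx), p[1] - (a[1] + t * aby))


def min_link_obstacle_distance(points, center, radius):
    """Minimum distance from any link to a circular obstacle.

    Stand-in for a GJK query; an irregular convex obstacle would need GJK.
    """
    links = zip(points, points[1:])
    return min(point_segment_distance(center, a, b) for a, b in links) - radius


def reward(points, ddq, target, center, radius,
           safe_margin=0.1, w_goal=1.0, w_acc=0.01, w_obs=5.0):
    """Hypothetical reward: goal distance, acceleration cost, obstacle penalty."""
    ee = points[-1]
    goal_term = -w_goal * math.hypot(ee[0] - target[0], ee[1] - target[1])
    acc_term = -w_acc * sum(a * a for a in ddq)
    d = min_link_obstacle_distance(points, center, radius)
    obs_term = -w_obs * max(0.0, safe_margin - d)
    return goal_term + acc_term + obs_term
```

In this sketch the policy output would be `ddq`; one environment step calls `integrate_joints`, then `forward_kinematics`, then `reward`. The obstacle penalty is active only when the closest link comes within `safe_margin` of the obstacle, which is one plausible shaping choice among many.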

Information

Type
Research Article
Copyright
© The Author(s), 2025. Published by Cambridge University Press

