Published online by Cambridge University Press: 18 December 2025
Although deep reinforcement learning (DRL) techniques have been studied extensively for robotic manipulators, little work directly maps the output of the policy function to the manipulator's joint space. This paper proposes a DRL-based motion planning scheme for redundant manipulators that avoids obstacles while accounting for the actual shapes of obstacles in the environment. The scheme not only accomplishes the path-planning task for the end-effector but also achieves autonomous obstacle avoidance while producing the manipulator's joint trajectories. First, a reinforcement learning framework based on the joint space is proposed: the policy outputs joint accelerations, from which the Cartesian coordinates of the end-effector are computed through forward kinematics, enabling end-to-end path planning for the end-effector. Second, the distances between all links of the manipulator and irregular obstacles are computed in real time with the Gilbert–Johnson–Keerthi (GJK) distance algorithm, and a reward function that incorporates both this distance and the joint accelerations realizes the obstacle avoidance task for the redundant manipulator. Finally, simulations and physical experiments on a 7-degree-of-freedom manipulator demonstrate that the proposed scheme generates efficient, collision-free trajectories in environments with irregular obstacles.
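The pipeline the abstract describes can be sketched in a few lines. The sketch below is not the paper's implementation: it uses a hypothetical 2-link planar arm in place of the 7-DOF manipulator, a scalar `clearance` value standing in for the GJK link-obstacle distance, and assumed weights (`w_goal`, `w_obs`, `w_acc`) and constants (`LINK_LENGTHS`, `DT`, `safe_dist`) chosen purely for illustration. It shows the structure: the policy's joint accelerations are double-integrated to joint angles, forward kinematics yields the end-effector position, and the reward combines goal distance, obstacle clearance, and an acceleration penalty.

```python
import numpy as np

# Hypothetical 2-link planar arm for illustration only; the paper's scheme
# targets a 7-DOF manipulator, but the flow is the same: policy outputs
# joint accelerations -> integrate to joint angles -> forward kinematics ->
# reward shaped by goal distance, obstacle clearance, and acceleration.

LINK_LENGTHS = np.array([0.5, 0.4])  # assumed link lengths (m)
DT = 0.05                            # assumed control period (s)

def forward_kinematics(q):
    """End-effector (x, y) of a 2-link planar arm for joint angles q."""
    x = LINK_LENGTHS[0] * np.cos(q[0]) + LINK_LENGTHS[1] * np.cos(q[0] + q[1])
    y = LINK_LENGTHS[0] * np.sin(q[0]) + LINK_LENGTHS[1] * np.sin(q[0] + q[1])
    return np.array([x, y])

def step(q, dq, ddq):
    """Double-integrate the policy's joint accelerations over one period."""
    dq_new = dq + ddq * DT
    q_new = q + dq * DT + 0.5 * ddq * DT ** 2
    return q_new, dq_new

def reward(ee_pos, goal, clearance, ddq,
           w_goal=1.0, w_obs=0.5, w_acc=0.01, safe_dist=0.05):
    """Shaped reward: approach the goal, keep link-obstacle clearance
    (a scalar stand-in here for the paper's GJK distance over all links),
    and penalize large joint accelerations for smooth trajectories."""
    r = -w_goal * np.linalg.norm(ee_pos - goal)
    if clearance < safe_dist:  # link has entered the safety margin
        r -= w_obs * (safe_dist - clearance) / safe_dist
    r -= w_acc * float(np.sum(ddq ** 2))
    return r
```

In the paper's actual scheme, `clearance` would be the minimum GJK distance computed in real time between every link of the manipulator and the irregular obstacle meshes, rather than a single scalar supplied by the caller.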