
A robust visual simultaneous localization and mapping system for dynamic environments without predefined dynamic labels and weighted features

Published online by Cambridge University Press: 17 September 2025

Shuai Xiang
Affiliation:
College of Electric Power, Inner Mongolia University of Technology, Hohhot, 010080, China; Intelligent Energy Technology and Equipment Engineering Research Centre of Colleges and Universities in Inner Mongolia Autonomous Region, Hohhot, 010051, China
Chaoyi Dong*
Affiliation:
College of Electric Power, Inner Mongolia University of Technology, Hohhot, 010080, China; Intelligent Energy Technology and Equipment Engineering Research Centre of Colleges and Universities in Inner Mongolia Autonomous Region, Hohhot, 010051, China; Engineering Research Center of Large Energy Storage Technology, Ministry of Education, Hohhot, 010010, China
Kang Zhang
Affiliation:
College of Electric Power, Inner Mongolia University of Technology, Hohhot, 010080, China; Intelligent Energy Technology and Equipment Engineering Research Centre of Colleges and Universities in Inner Mongolia Autonomous Region, Hohhot, 010051, China
Ge Tai
Affiliation:
College of Electric Power, Inner Mongolia University of Technology, Hohhot, 010080, China; Intelligent Energy Technology and Equipment Engineering Research Centre of Colleges and Universities in Inner Mongolia Autonomous Region, Hohhot, 010051, China
Tianyu Yuan
Affiliation:
College of Electric Power, Inner Mongolia University of Technology, Hohhot, 010080, China; Intelligent Energy Technology and Equipment Engineering Research Centre of Colleges and Universities in Inner Mongolia Autonomous Region, Hohhot, 010051, China
Haoda Yan
Affiliation:
College of Electric Power, Inner Mongolia University of Technology, Hohhot, 010080, China; Intelligent Energy Technology and Equipment Engineering Research Centre of Colleges and Universities in Inner Mongolia Autonomous Region, Hohhot, 010051, China
Xiaoyan Chen
Affiliation:
College of Electric Power, Inner Mongolia University of Technology, Hohhot, 010080, China; Intelligent Energy Technology and Equipment Engineering Research Centre of Colleges and Universities in Inner Mongolia Autonomous Region, Hohhot, 010051, China; Engineering Research Center of Large Energy Storage Technology, Ministry of Education, Hohhot, 010010, China
Corresponding author: Chaoyi Dong; Email: dongchaoyi@imut.edu.cn

Abstract

Visual Simultaneous Localization and Mapping (vSLAM) is fundamentally limited by the static-world assumption, which makes its application in dynamic environments challenging. This paper proposes a robust vSLAM system, RFN-SLAM, which is built on ORB-SLAM3 and handles dynamic scenes without requiring predefined dynamic labels or weighted features. In the feature extraction stage, an enhanced version of the efficient binary BAD descriptor is used to improve the accuracy of static feature matching. RFN-SLAM obtains semantic information from an improved RT-DETR object detection network and the FAST-SAM instance segmentation network, and applies a novel dynamic box detection algorithm to identify and eliminate feature points on dynamic objects. During pose optimization, static feature points are weighted according to this dynamic information, which significantly reduces mismatches and improves localization accuracy. Meanwhile, neural radiance field rendering is used to remove dynamic objects and reconstruct the static scene in 3D. Experiments were conducted on the TUM RGB-D dataset, the Bonn dataset, and a self-collected dataset. The results show that RFN-SLAM significantly outperforms ORB-SLAM3 in localization accuracy in dynamic environments, localizes more accurately than other state-of-the-art dynamic SLAM methods, and achieves accurate 3D reconstruction of static scenes. In addition, RFN-SLAM maintains real-time performance without sacrificing accuracy.
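The abstract only sketches the pipeline; the paper's exact dynamic box detection algorithm and feature-weighting formula are not reproduced on this page. As a rough illustration of the masking-and-weighting idea, the Python sketch below drops keypoints that fall on dynamic objects and down-weights the surviving static points by their distance to the nearest dynamic region. The function name, the per-object masks (standing in here for RT-DETR boxes refined by FAST-SAM segments), and the exponential distance weighting with its 40-pixel scale are all illustrative assumptions, not RFN-SLAM's actual method.

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def filter_and_weight_keypoints(keypoints, dynamic_masks, sigma=40.0):
        """Drop keypoints inside dynamic-object masks; weight the rest.

        keypoints     : (N, 2) float array of (u, v) pixel coordinates
        dynamic_masks : list of (H, W) boolean arrays, one per dynamic object
        sigma         : distance scale in pixels for the weight falloff
        """
        if not dynamic_masks:
            return keypoints, np.ones(len(keypoints))

        # Union of all per-object masks: True wherever any dynamic object is.
        combined = np.logical_or.reduce(dynamic_masks)
        h, w = combined.shape
        u = np.clip(np.round(keypoints[:, 0]).astype(int), 0, w - 1)
        v = np.clip(np.round(keypoints[:, 1]).astype(int), 0, h - 1)

        # Discard keypoints that land on a dynamic object.
        inside = combined[v, u]
        static_kps = keypoints[~inside]

        # Euclidean distance transform: for every static pixel, the distance
        # to the nearest dynamic pixel.
        dist = distance_transform_edt(~combined)
        d = dist[v[~inside], u[~inside]]

        # Weight grows smoothly from ~0 beside a dynamic object toward 1 far
        # away, so points near possibly mis-segmented boundaries count less.
        weights = 1.0 - np.exp(-d / sigma)
        return static_kps, weights

In a system of this kind, each surviving point's weight would then scale its reprojection residual during pose optimization, so that features near dynamic boundaries contribute less to the estimated camera pose; the specific weighting that RFN-SLAM derives from its dynamic information may differ.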

Information

Type
Research Article
Copyright
© The Author(s), 2025. Published by Cambridge University Press

