Published online by Cambridge University Press: 17 September 2025
Visual Simultaneous Localization and Mapping (vSLAM) is fundamentally limited by the static-world assumption, which makes it difficult to apply in dynamic environments. This paper proposes RFN-SLAM, a robust vSLAM system that is based on ORB-SLAM3 and does not require preset dynamic labels and weighted features to process dynamic scenes. In the feature extraction stage, an enhanced version of BAD, an efficient binary image descriptor, is used to improve the accuracy of static feature point matching. RFN-SLAM obtains semantic information from an improved RT-DETR object detection network and a FAST-SAM instance segmentation network, and uses a novel dynamic box detection algorithm to identify and eliminate feature points on dynamic objects. During pose optimization, the static feature points are weighted according to the dynamic information, which significantly reduces mismatches and improves localization accuracy. In addition, neural radiance field (NeRF) rendering is used to remove dynamic objects and render the static scene in 3D. Experiments were conducted on the TUM RGB-D dataset, the Bonn dataset, and a self-collected dataset. The results show that RFN-SLAM localizes significantly more accurately than ORB-SLAM3 in dynamic environments. It also localizes more accurately than other advanced dynamic SLAM methods and produces accurate 3D reconstructions of static scenes. Moreover, RFN-SLAM maintains real-time performance without sacrificing accuracy.
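The two steps named above, discarding feature points on dynamic objects and weighting the remaining static points by dynamic information, can be illustrated with a minimal sketch. It is not the RFN-SLAM implementation: the `Box` type, the `filter_and_weight` function, the `sigma` parameter, and the Gaussian-style weight rule are all assumptions chosen for illustration, and plain axis-aligned boxes stand in for the detector and segmentation outputs.

```python
from dataclasses import dataclass
from math import exp, hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max

    def distance_to(self, x: float, y: float) -> float:
        """Euclidean distance from a point to the box (0 if inside)."""
        dx = max(self.x_min - x, 0.0, x - self.x_max)
        dy = max(self.y_min - y, 0.0, y - self.y_max)
        return hypot(dx, dy)


def filter_and_weight(keypoints, dynamic_boxes, sigma=20.0):
    """Drop keypoints inside dynamic boxes and weight the rest.

    Assumed weight rule (illustrative only, not taken from the paper):
        w = 1 - exp(-d^2 / (2 * sigma^2))
    where d is the pixel distance to the nearest dynamic box, so points
    close to a moving object contribute less to pose optimization.
    """
    kept = []
    for x, y in keypoints:
        # Step 1: eliminate feature points that lie on a dynamic object.
        if any(box.contains(x, y) for box in dynamic_boxes):
            continue
        # Step 2: weight the surviving static points.
        if not dynamic_boxes:
            kept.append(((x, y), 1.0))
            continue
        d = min(box.distance_to(x, y) for box in dynamic_boxes)
        weight = 1.0 - exp(-(d * d) / (2.0 * sigma * sigma))
        kept.append(((x, y), weight))
    return kept


if __name__ == "__main__":
    boxes = [Box(100, 100, 200, 200)]
    points = [(150.0, 150.0), (210.0, 150.0), (400.0, 400.0)]
    for point, weight in filter_and_weight(points, boxes):
        print(point, round(weight, 4))
```

In a full system, weights of this kind would typically scale each point's reprojection residual in the pose optimizer, and instance masks would replace the boxes for tighter object boundaries. Both of those choices are likewise assumptions here.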