
Precision weed detection and mapping in vegetables using deep learning

Published online by Cambridge University Press:  08 September 2025

Weili Li
Affiliation:
Assistant Professor, Jincheng College, Nanjing University of Aeronautics and Astronautics, Nanjing, China; current: Visiting Scholar, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Shandong, China
Wenpeng Zhu
Affiliation:
Intern, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Shandong, China
Jinxu Wang
Affiliation:
Research Assistant, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Shandong, China
Kang Han
Affiliation:
Research Assistant, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Shandong, China
Xiaojun Jin*
Affiliation:
Associate Professor, National Engineering Research Center of Biomaterials, Nanjing Forestry University, Nanjing, China
Jialin Yu
Affiliation:
Professor and Principal Investigator, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences in Weifang, Shandong, China
*
Corresponding author: Xiaojun Jin; Email: xiaojunjin@njfu.edu.cn

Abstract

Precision weed detection and mapping in vegetable crops are beneficial for improving the effectiveness of weed control. This study proposes a novel method for indirect weed detection and mapping using a detection network based on the You-Only-Look-Once-v8 (YOLOv8) architecture. This approach detects weeds by first identifying vegetables and then segmenting weeds from the background using image processing techniques. Subsequently, weed mapping was established and innovative path planning algorithms were implemented to optimize actuator trajectories along the shortest possible path. Experimental results demonstrated significant improvements in both precision and computational efficiency compared with the original YOLOv8 network. The mean average precision at 0.5 (mAP50) increased by 0.2, while the number of parameters, giga floating-point operations per second (GFLOPS), and model size decreased by 0.57 million, 1.8 GFLOPS, and 1.1 MB, respectively, highlighting enhanced accuracy and reduced computational costs. Among the analyzed path planning algorithms, including Christofides, Dijkstra, and dynamic programming (DP), the Dijkstra algorithm was the most efficient, producing the shortest path for guiding the weeding system. This method enhances the robustness and adaptability of weed detection by eliminating the need to detect diverse weed species. By integrating precision weed mapping and efficient path planning, mechanical actuators can target weed-infested areas with optimal precision. This approach offers a scalable solution that can be adapted to other precision weeding applications.

Information

Type
Research Article
Creative Commons
This is a CC BY licensed article.
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Weed Science Society of America

Introduction

Vegetables are recognized as nutrient-dense foods, rich in essential vitamins, minerals, and antioxidants (Kumar et al. 2020). Vegetables account for approximately 35% of per capita dietary intake in China, the world's largest consumer of vegetables (Dong et al. 2022). Weeds pose a significant challenge by competing with vegetables for sunlight, water, and nutrients (Berge et al. 2008; Hamuda et al. 2016). Manual weeding, while effective, is both labor-intensive and time-consuming (Slaughter et al. 2008). The development of automated weeding technologies offers a promising solution to these challenges (Memon et al. 2025).

Extensive research has been conducted on machine vision technologies for weed detection (Bakhshipour et al. 2017; Gerhards et al. 2022; Pantazi et al. 2016; Perez et al. 2020). These technologies typically classify weed and crop features into four categories: color, shape, texture, and spectra (Chen et al. 2024; Kong et al. 2024). While these methods perform well under controlled conditions, their effectiveness often diminishes in field environments due to challenges such as leaf overlap and occlusion (Jin et al. 2022c; Tao and Wei 2024). Furthermore, vision-based approaches rely heavily on manually designed features, which introduces subjectivity and limits robustness, especially given the high similarity between weeds and crops (Hasan et al. 2021; Jin et al. 2023).

The rapid advancements in graphics processing units (GPUs) have significantly accelerated the evolution of deep learning (Jordan and Mitchell 2015; Mahesh 2020). With powerful learning and generalization capabilities, deep learning has been widely adopted for image identification (LeCun et al. 2015; Pak and Kim 2017), speech recognition (Zhang et al. 2018), natural language processing (Otter et al. 2020), and autonomous driving (Grigorescu et al. 2020). The capacity to process massive datasets and leverage high-performance computing makes deep learning particularly well suited for deciphering, measuring, and understanding data-intensive agricultural processes (Liakos et al. 2018). In agriculture, deep learning has been applied to a wide range of tasks, including yield prediction (Liu et al. 2021), disease detection (Chung et al. 2016), weed detection (Grinblat et al. 2016), crop quality (Peng et al. 2022), species recognition (Jin et al. 2022b), and more (Pantazi et al. 2016; Sengupta and Lee 2014). These advancements highlight the transformative potential of deep learning in modern agriculture, offering innovative solutions to complex challenges.

Numerous studies have been conducted on the use of deep convolutional neural networks (DCNNs) for precise weed detection (Rai et al. 2023; Xu et al. 2023). For instance, Modi et al. (2023) trained six models with varying hyperparameters to identify weeds in actively growing sugarcane (Saccharum officinarum L.) crops. Among these, DarkNet53 outperformed the other models with a high F1 score greater than 99%. Dyrmann et al. (2016) proposed a new network, which was trained and tested on images from various datasets under different lighting conditions and soil types. This network achieved an 82% accuracy rate in classifying 22 species of weeds. The capability of deep learning for precision weed detection in turf was first reported by Yu et al. (2019c). Three DCNNs were trained to detect broadleaf weeds in turfgrass, with VGGNet emerging as the best-performing model, achieving both an F1 score and overall accuracy exceeding 0.99, and a recall value of 1.00. A series of additional studies have further compared and analyzed weed detection using DCNNs from various perspectives (Jin et al. 2022a, 2022b; Yu et al. 2019a, 2020), consistently demonstrating the potential of DCNNs in precision weed detection.

Despite significant advancements in deep learning methods for weed detection, several challenges remain. Natural environments often contain diverse weed species, ecotypes, densities, and growth stages, making it difficult to establish comprehensive weed datasets (Pei et al. 2022; Zhuang et al. 2022). Additionally, weeds exhibit distinct appearance characteristics at different growth stages and densities, even within the same field. Direct weed detection requires the collection of a massive number of weed images, which often results in reduced robustness and generalization capabilities in detection systems. To address these challenges, this research proposes a novel deep learning method for weed detection and mapping. Vegetables are first detected using an innovative network based on the YOLOv8 architecture, and the remaining green vegetation (weeds) is subsequently segmented using image processing techniques. The objectives of this research were to (1) evaluate the performance of the improved vegetable detection (IVD) network, (2) segment weeds from the background images and establish a weed mapping system for precision weeding application, and (3) evaluate the effectiveness of path planning algorithms to guide the operation of weeding actuators.

Materials and Methods

Overview

This study focuses on developing and applying the IVD network based on the YOLOv8 architecture to detect bok choy [Brassica rapa ssp. chinensis (L.) Hanelt]. Bok choy is a fast-growing leafy vegetable that is widely cultivated in Asia, particularly in China. It is valued for its short growth cycle, high nutritional content, and significant contribution to local diets. Typically, bok choy reaches maturity within 25 to 35 d after planting. In this study, bok choy plants at the 2- to 4-true leaf stage, with an average height of approximately 5 to 10 cm, were selected for image acquisition. Once bok choy was accurately detected, the remaining green vegetation in the background was identified as weeds. Image processing techniques were then employed to segment weeds from the background, with area filtering applied to eliminate potential random noise. The original images were subsequently divided into grid cells, and cells containing weeds were labeled in red to create a weed mapping system. Finally, a path planning algorithm was implemented to guide the mechanical actuators along the most efficient and shortest path for operation. The entire procedure is illustrated in Figure 1.

Figure 1. The workflow illustrating the detection and mapping process for bok choy (Brassica rapa ssp. chinensis) using the improved vegetable detection (IVD) model. Target vegetables are first identified, and the remaining green vegetation is segmented as weeds through image processing and area filtering. The processed images are divided into grid cells, with weed-containing cells marked in red to generate a distribution map. A path planning algorithm is then applied to optimize the route for weed control operations.

Image Acquisition

The images of bok choy and weeds were captured from multiple vegetable fields located in Jiangning District (approximately 31.95°N, 118.90°E) and Qixia District (approximately 32.15°N, 118.95°E) of Nanjing, Jiangsu Province, China, during May and October 2022. These fields were selected to represent diverse planting conditions and growth stages. Images were taken with a digital camera (HV1300FC, DaHeng Image, Beijing, China) at an aspect ratio of 4:3 and a resolution of 1,792 × 1,344 pixels. The camera was positioned approximately 0.6 m above the ground, operating in automatic mode for focus, exposure, and white balance settings. To ensure the diversity of the training dataset, images were collected under various lighting conditions, including sunny, cloudy, and partly cloudy skies.

Training and Testing

A total of 1,500 images were annotated using the LabelImg (https://github.com/HumanSignal/labelImg) software. Rectangular bounding boxes were drawn around bok choy to generate corresponding XML label files for the dataset. The annotated images were then divided into training, validation, and testing datasets comprising 1,200 images (80%), 150 images (10%), and 150 images (10%), respectively.
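As a concrete illustration, the 80/10/10 split can be reproduced with a few lines of Python; the directory name and random seed below are illustrative, not values from the study.

```python
import random
from pathlib import Path

images = sorted(Path("dataset/images").glob("*.jpg"))  # 1,500 annotated images
random.seed(42)            # illustrative seed for a reproducible shuffle
random.shuffle(images)

n = len(images)
splits = {
    "train": images[: int(0.8 * n)],                  # 1,200 images (80%)
    "val": images[int(0.8 * n): int(0.9 * n)],        # 150 images (10%)
    "test": images[int(0.9 * n):],                    # 150 images (10%)
}
for name, files in splits.items():
    print(name, len(files))
```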

Improved Vegetable Detector

The IVD network was developed by enhancing the YOLOv8 architecture. As a leading example of one-stage deep learning frameworks, YOLO architectures are widely used in real-time object detection due to their exceptional efficiency and precision (Terven et al. 2023). YOLOv8 introduces significant advancements, making it versatile for instance segmentation, key point detection, object detection, and classification tasks (Kashyap 2024).

In the YOLO architecture, the backbone is responsible for extracting key features from input images, while the neck aggregates and refines these features before passing them to the detection head (Deng et al. 2025). A slim-neck design further improves computational efficiency while preserving essential feature information. Optimizing these components is critical for enhancing both detection accuracy and speed, which are essential for real-time weed detection in agricultural environments.

Although YOLOv8 performs well in general object detection tasks, its feature extraction and detection speed require further optimization for bok choy detection, particularly to distinguish fine-grained features within cluttered field environments. To address these challenges, a novel vegetable detection network was developed with two key improvements:

  1. The previous feature fusion layers in the neck were replaced with a slimmed neck (slim-neck) module.

  2. In the backbone, the convolution to fully connected (C2f) module was replaced with the C2f-Faster-EMA module, which draws on an attention mechanism and FasterNet.

YOLOv8-C2f-Faster-EMA

The YOLOv8-C2f-Faster-EMA network is an enhancement of the YOLOv8 deep learning architecture (Zhu et al. 2024), introducing two principal improvements:

  1. Efficient multiscale attention (EMA): this component integrates multiscale feature fusion and attention mechanisms to enhance the network's identification capabilities.

  2. Faster Block of FasterNet: the Faster Block, which employs parallel processing, was integrated into the neck of YOLOv8 to improve detection precision.

Figure 2 describes the optimized architecture of the network. In this research, the C2f-Faster-EMA design was adopted in the backbone of the baseline YOLOv8 network, replacing the original C2f module. This enhanced architecture is referred to as YOLOv8-C2f-Faster-EMA.

Figure 2. The architecture of YOLOv8-C2f-Faster-EMA. The original convolution to fully connected (C2f) modules are replaced with C2f-Faster-EMA modules to improve feature extraction and computational efficiency. Additionally, in the backbone network, the bottleneck operators in the C2f modules at stages 3, 5, 7, and 9 were hierarchically substituted with the proposed C2f-Faster-EMA units to enhance feature extraction and information flow. SPPF in the model is the abbreviation of Spatial Pyramid Pooling Fast, which is a module used for pooling operations at different scales.
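The efficiency gain of the Faster Block stems from partial convolution (PConv), in which only a fraction of the channels pass through the convolution while the rest are carried through unchanged. A minimal PyTorch sketch of this idea follows; the module name and channel ratio are illustrative, not taken from the authors' implementation.

```python
import torch
import torch.nn as nn

class PartialConv(nn.Module):
    """Sketch of FasterNet-style partial convolution: convolve only a subset
    of channels, pass the remaining channels through untouched."""
    def __init__(self, channels: int, ratio: float = 0.25):
        super().__init__()
        self.conv_ch = int(channels * ratio)  # channels that are convolved
        self.conv = nn.Conv2d(self.conv_ch, self.conv_ch, 3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Split along the channel axis: convolve the first part, keep the rest.
        x1, x2 = torch.split(x, [self.conv_ch, x.shape[1] - self.conv_ch], dim=1)
        return torch.cat((self.conv(x1), x2), dim=1)

x = torch.randn(1, 64, 56, 56)
print(PartialConv(64)(x).shape)  # torch.Size([1, 64, 56, 56])
```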

Slim-Neck

The neck of a network sits between the backbone and the head, enhancing the expressive power of features and delivering richer feature information to the head for image classification and object detection. The slim-neck module redesigns the neck for greater efficiency. Depthwise separable convolutions (DSC) were introduced to alleviate the high computational cost of large-scale processing; however, this comes at the cost of reduced effectiveness in feature extraction and fusion compared with standard convolutions (SC). Group shuffle convolution (GSConv) was therefore devised as a fusion of SC, DSC, and a channel shuffle strategy, uniformly exchanging local features between channels so that information generated by the SC is transfused into the DSC output (Chollet 2017). The slimmed neck is accordingly recommended in combination with a general backbone. The architecture of GSConv is illustrated in Figure 3.

Figure 3. Architecture of the group shuffle convolution (GSConv) module. The standard convolution operators in the neck module were systematically replaced with GSConv units, which are specifically designed to enhance cross-level feature fusion through a lightweight channel-spatial attention mechanism.
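A minimal PyTorch sketch of the GSConv idea is shown below: a standard convolution produces half of the output channels, a depthwise convolution processes that output to produce the other half, and a channel shuffle interleaves the two branches. Layer details such as the 5 × 5 depthwise kernel are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class GSConv(nn.Module):
    """Sketch of group shuffle convolution: SC branch + DSC branch + shuffle."""
    def __init__(self, c_in: int, c_out: int, k: int = 1, s: int = 1):
        super().__init__()
        c_half = c_out // 2
        self.sc = nn.Sequential(  # standard convolution branch
            nn.Conv2d(c_in, c_half, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())
        self.dsc = nn.Sequential(  # depthwise convolution branch
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1 = self.sc(x)
        x = torch.cat((x1, self.dsc(x1)), dim=1)
        # Channel shuffle: interleave the SC and DSC channels.
        b, c, h, w = x.shape
        return x.view(b, 2, c // 2, h, w).transpose(1, 2).reshape(b, c, h, w)

x = torch.randn(1, 64, 40, 40)
print(GSConv(64, 128)(x).shape)  # torch.Size([1, 128, 40, 40])
```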

Although GSConv reduces computational cost by 50% or more compared with SC, the model's learning capacity remains limited. To further enhance performance, a single-stage aggregation module based on VoVNet (VoV-GSCSP) was used to replace the neck of the model, with the GSbottleneck, built from GSConv units, introduced into the module. The slimmed neck design significantly improves inference efficiency.

The IVD network combines the optimized neck and backbone described above, applying these targeted modifications to the baseline YOLOv8 architecture. The complete flowchart of this architecture is presented in Figure 4.

Figure 4. Overall architecture of the improved vegetable detection (IVD) model. The group shuffle convolution (GSConv) units were introduced for Slim-neck construction, and VoV-GSCSP modules were integrated into the You-Only-Look-Once-v8 (YOLOv8) framework. During inference, multiscale feature maps undergo channel compression via GSConv, followed by bilinear upsampling and concatenation to establish cross-resolution connections. These features are further refined through secondary GSConv filtering and final consolidation via the single-stage aggregation modules based on VoVNet (VoV-GSCSP) fusion gates. In the backbone, computational redundancy is reduced by replacing conventional bottlenecks in the convolution to fully connected (C2f) modules with Faster-EMA blocks, which apply the efficient multiscale attention (EMA) mechanisms to enhance salient spatial-frequency feature extraction.

Experiment Setup

Training and testing were performed in the PyTorch v. 1.8.1 deep learning environment (https://pytorch.org; Facebook, San Jose, CA, USA) on an NVIDIA GeForce RTX 2080 Ti GPU. Transfer learning is commonly employed to apply knowledge gained from related domains to novel yet analogous problems (Weiss et al. 2016). In this research, the IVD network was pretrained on ImageNet, a large-scale dataset with more than 14 million labeled images (Deng et al. 2009). During training, all layers of the network were fine-tuned on the bok choy detection dataset without freezing any backbone or neck parameters, allowing full adaptation of feature representations to the target domain. The following hyperparameters were used, in accordance with YOLOv8 default settings: a batch size of 16, momentum of 0.937, an initial learning rate of 0.01, stochastic gradient descent (SGD) as the optimizer, a weight decay of 0.0005, and a training duration of 100 epochs.
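For reference, this training configuration can be expressed with the Ultralytics YOLO API as sketched below. The dataset file bokchoy.yaml is hypothetical, and the COCO-pretrained yolov8n.pt checkpoint stands in for the pretraining step; only the hyperparameters are taken from the text above.

```python
from ultralytics import YOLO

# Start from a pretrained checkpoint for transfer learning.
model = YOLO("yolov8n.pt")
model.train(
    data="bokchoy.yaml",   # hypothetical dataset description file
    epochs=100,            # training duration
    batch=16,              # batch size
    optimizer="SGD",       # stochastic gradient descent
    lr0=0.01,              # initial learning rate
    momentum=0.937,
    weight_decay=0.0005,
)
metrics = model.val()      # evaluate mAP50 and mAP50-95 on the validation split
```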

Evaluation Metrics

Accuracy and efficiency are crucial for real-time applications. This research employed precision, recall, mean average precision (mAP), and giga floating-point operations per second (GFLOPS) as metrics to evaluate the model’s performance.

The network's training and testing results were organized into a binary confusion matrix with four outcomes: true positive (TP), false positive (FP), true negative (TN), and false negative (FN) (Baldi et al. 2000).

Precision represents the ratio of correctly predicted positive instances to the total number of instances predicted as positive by the model (Prati et al. 2011; Sokolova and Lapalme 2009). It was calculated as:

([1]) $$\mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}$$

Recall represents the proportion of correctly predicted positive instances out of all actual positive instances (Grandini et al. 2020). It was calculated as:

([2]) $$\mathrm{Recall} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}$$

Intersection over union (IoU) measures the ratio of the overlap between the predicted bounding box and the actual bounding box. A higher IoU indicates a more accurate prediction. It was calculated as:

([3]) $$\mathrm{IoU} = \frac{\mathrm{Area\ of\ overlap}}{\mathrm{Area\ of\ union}}$$

While precision and recall represent distinct evaluation criteria, average precision (AP) provides a comprehensive index that considers both metrics (Everingham et al. 2015). It was calculated as:

([4]) $$\mathrm{AP} = \int_0^1 p(R)\,\mathrm{d}R$$

where $p(R)$ is the precision-recall curve, with precision plotted on the vertical axis and recall on the horizontal axis. mAP, a commonly used metric in object detection, is the average AP value across all categories. It was calculated as:

([5]) $$\mathrm{mAP} = \frac{\sum_{i=1}^{N} \mathrm{AP}_i}{N}$$

The values of mAP50 and mAP50-95 are commonly used as evaluation metrics for detection performance. The mAP50 value is defined as the mAP computed at an IoU threshold of 50%, while the mAP50-95 value is the mAP averaged over IoU thresholds ranging from 50% to 95%. The mAP50-95 is the stricter metric, as it considers multiple IoU thresholds.

GFLOPs quantifies the number of floating-point operations, in billions, required during inference. Smaller GFLOPs values indicate lower computational demands and faster inference. GFLOPs is a standard metric for evaluating the efficiency of YOLO networks.
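To make Equations 1 to 4 concrete, the sketch below computes precision, recall, IoU, and AP; the counts, boxes, and the rectangle-rule integration are illustrative.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over union (Equation 3) of boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def average_precision(recall, precision):
    """Numerical area under the precision-recall curve (Equation 4)."""
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    p = np.maximum.accumulate(p[::-1])[::-1]  # monotone precision envelope
    return float(np.sum(np.diff(r) * p[1:]))

tp, fp, fn = 90, 10, 5                               # illustrative counts
print("precision:", tp / (tp + fp))                  # Equation 1 -> 0.9
print("recall:", tp / (tp + fn))                     # Equation 2 -> ~0.947
print("IoU:", iou((0, 0, 10, 10), (5, 5, 15, 15)))   # -> ~0.143
print("AP:", average_precision(np.array([0.2, 0.5, 0.8]),
                               np.array([1.0, 0.8, 0.6])))  # -> 0.62
```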

Image Processing

Both vegetables and weeds are green, while the soil has a distinct color. Once the network detects the vegetables, the remaining pixels in the background are weeds, straw, or soil. Vegetable pixels are removed first, and the remaining green vegetation in the background is identified as weeds.

The excess green (ExG) index (Morid et al. 2021), previously explored for weed identification (Jin et al. 2022c; Sun et al. 2024), was optimized in this research to enhance weed segmentation performance. The modified ExG index is defined as:

([6]) $$\mathrm{ExG} = \begin{cases} 0, & \text{if } g < r \text{ or } g < b \\ 1.875g - r - b, & \text{otherwise} \end{cases}$$

To reduce sensitivity to varying illumination, the modified ExG index uses normalized RGB values:

([7]) $$r = \frac{R}{R+G+B}, \quad g = \frac{G}{R+G+B}, \quad b = \frac{B}{R+G+B}$$

The Otsu method (Otsu 1975) was applied to convert grayscale images into binary images. This was followed by area filtering to eliminate random noise in the background. As a result, weeds were effectively segmented from the original images.
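A minimal OpenCV sketch of this segmentation pipeline follows, assuming vegetable pixels inside the detected bounding boxes have already been masked out; the minimum component area is an assumed parameter, not a value reported in the study.

```python
import cv2
import numpy as np

def segment_weeds(bgr: np.ndarray, min_area: int = 50) -> np.ndarray:
    """Modified ExG (Equations 6-7) -> Otsu binarization -> area filtering."""
    b, g, r = cv2.split(bgr.astype(np.float32))
    total = r + g + b + 1e-6                       # avoid division by zero
    rn, gn, bn = r / total, g / total, b / total   # normalized RGB (Equation 7)
    exg = np.where((gn < rn) | (gn < bn), 0.0, 1.875 * gn - rn - bn)  # Equation 6
    gray = cv2.normalize(exg, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Area filtering: drop connected components smaller than min_area pixels.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    mask = np.zeros_like(binary)
    for i in range(1, n):
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            mask[labels == i] = 255
    return mask
```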

Weed Mapping

A custom program was developed to divide the original images (1,792 × 1,344 pixels) into 48 equal grid cells measuring 224 × 224 pixels, arranged in 6 rows and 8 columns. Once the positions of the weeds were determined, the corresponding grid cell(s) were labeled as weeding area(s), and a weed map was generated.

For a weeding system equipped with a mechanical weeding machine, each grid cell represents a unit of weeding area, facilitating the integration of weed detection results with field application. In actual applications, the size of each grid cell should be equal to or slightly smaller than the mechanical actuator’s footprint. This configuration ensures that the mechanical actuators are directed only toward grid cells marked as weed infested, thereby achieving precise and efficient weeding.
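The grid-cell mapping can be sketched in a few lines of NumPy, as below; the sensitivity threshold min_pixels is an assumption for illustration, not a parameter from the study.

```python
import numpy as np

def weed_map(mask: np.ndarray, cell: int = 224, min_pixels: int = 1) -> np.ndarray:
    """Divide a 1,344 x 1,792 binary weed mask into 6 x 8 grid cells and flag
    each cell that contains at least min_pixels weed pixels."""
    rows, cols = mask.shape[0] // cell, mask.shape[1] // cell   # 6 rows, 8 columns
    grid = np.zeros((rows, cols), dtype=bool)
    for i in range(rows):
        for j in range(cols):
            patch = mask[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            grid[i, j] = np.count_nonzero(patch) >= min_pixels
    return grid

mask = np.zeros((1344, 1792), dtype=np.uint8)
mask[300:330, 500:540] = 255            # a synthetic weed patch
print(np.argwhere(weed_map(mask)))      # [[1 2]]: row 1, column 2 needs weeding
```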

Path Planning

Once the weed map was constructed, path planning algorithms were designed to guide the mechanical actuators across the grid cells along an optimal route for real-time weeding. The performance of three path planning algorithms was compared and analyzed: the Christofides algorithm (Papadimitriou and Vazirani 1984), the Dijkstra algorithm (Xu et al. 2007), and DP (Bellman 1954).

  1. The Christofides algorithm is an approximation algorithm for the traveling salesman problem on a metric space, in which distances are symmetric and satisfy the triangle inequality. It strikes a balance between solution quality and computational time.

  2. The Dijkstra algorithm finds the shortest path in a weighted graph by repeatedly expanding the nearest unvisited node from the start point; the process terminates once all nodes have been visited.

  3. DP solves the optimization problem of a multistage decision-making process by decomposing the overall problem into smaller subproblems and storing intermediate results to avoid redundant computation (Bellman 1954).

For field application, the mechanical actuators are aligned with the grid cells and follow the optimal path determined by the selected path planning algorithm. To assess the performance of the path planning algorithms, execution time and the length of the planned path (measured in pixels) were analyzed and compared.
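As an illustration, the sketch below runs Dijkstra on the 4-connected grid of cells and greedily sends the actuator to the nearest unvisited weed cell; the greedy visiting order is an assumption for demonstration, not necessarily the authors' exact routing scheme.

```python
import heapq

def dijkstra(grid_shape, start):
    """Shortest grid distances from start to every cell (unit edge weights)."""
    rows, cols = grid_shape
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist.get((r, c), float("inf")):
            continue  # stale heap entry
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and \
                    d + 1 < dist.get((nr, nc), float("inf")):
                dist[(nr, nc)] = d + 1
                heapq.heappush(heap, (d + 1, (nr, nc)))
    return dist

def plan_route(grid_shape, weed_cells, start=(0, 0)):
    """Greedy tour: always move to the nearest unvisited weed cell."""
    route, current, remaining = [start], start, set(weed_cells)
    while remaining:
        dist = dijkstra(grid_shape, current)
        current = min(remaining, key=lambda cell: dist[cell])
        remaining.remove(current)
        route.append(current)
    return route

# 6 x 8 grid (Weed Mapping section) with three flagged weed cells.
print(plan_route((6, 8), [(1, 2), (4, 7), (0, 5)]))
# [(0, 0), (1, 2), (0, 5), (4, 7)]
```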

Results and Discussion

Vegetable Detection

An ablation experiment was conducted to validate the efficiency of the IVD network. The C2f-Faster-EMA module, the slim-neck module, and the complete network were each evaluated against the baseline YOLOv8 network. The results of the ablation experiment are summarized in Table 1. When only the C2f-Faster-EMA module was implemented at the backbone stage to replace the original C2f module, precision increased by 0.9%, and computational costs were significantly reduced: the number of parameters, GFLOPS, and model size decreased by 23.2%, 19.7%, and 22.2%, respectively. These results demonstrate that the C2f-Faster-EMA module significantly improved computational efficiency. However, there was a slight reduction in the mAP50-95 and recall values, which decreased by 0.5% and 1.9%, respectively. This reduction can be attributed to the simplified feature extraction inherent in the lightweight backbone design. Nevertheless, given the increased precision and substantial efficiency gains, this trade-off remains acceptable for real-time field applications with limited computational resources.

Table 1. Ablation study results evaluating the impact of C2f-Faster-EMA and Slim-neck modules on detection performance and model complexity.a

a Abbreviations: mAP50, mean average precision at an IoU threshold of 0.5; mAP50-95, mean average precision averaged over IoU thresholds from 50% to 95%; GFLOPS, giga floating-point operations per second; YOLOv8, You-Only-Look-Once-v8; G, giga; M, megabyte.

The results showed that the slim-neck module, designed to achieve lightweight optimization while enhancing computational performance, also demonstrated reductions in parameters, GFLOPS, and model size. Notably, the mAP50-95 value was maintained, further validating the module’s efficiency. When both the C2f-Faster-EMA and slim-neck modules were integrated into the YOLOv8 network, a well-balanced outcome was achieved. The mAP50 value was preserved, while computational costs were effectively reduced, highlighting the synergy of these modules in improving performance.

Figure 5 illustrates the performance of the IVD model in vegetable detection under complex field conditions, including cluttered backgrounds, dense weed–vegetable overlap, and strong illumination. The model demonstrated accurate localization, high precision, and strong robustness across these challenging scenarios, confirming its suitability for real-world deployment. These qualitative results are complemented by the training performance shown in Figure 6, where the IVD model exhibits a steeper loss curve with faster convergence compared with YOLOv8, indicating more efficient optimization during training.

Figure 5. Detection results of the improved vegetable detection (IVD) model on vegetables under challenging conditions, including complex backgrounds and dense weed–vegetable clusters.

Figure 6. Training loss curve of the improved vegetable detection (IVD) model over 100 epochs. The IVD model exhibits a steeper loss curve with faster convergence compared with You-Only-Look-Once-v8 (YOLOv8), indicating more efficient optimization during training.

To illustrate the processing results at each stage, the original images, along with those processed through DCNN detection, image processing, and weed mapping, are presented in Figure 7 for comparison. The images in the first row are the original images, while those in the second row display the detection results from the IVD network, with each detected vegetable framed by a bounding box. Pixels within these bounding boxes represent vegetables and were removed, allowing the remaining green vegetation to be identified as weeds. Segmentation was then performed using image processing techniques, including the ExG index and the area filtering algorithm, to isolate weeds from the background. The third row of Figure 7 shows the binary images produced by ExG-based enhancement and Otsu thresholding, while the fourth row shows the weed regions remaining after vegetable removal and area filtering. Weeds within vegetable crops were thus indirectly identified through the integration of DCNNs and image processing methods.

Figure 7. Weed mapping workflow from original images to trajectory planning. The first row shows the original images of vegetable fields. The second row displays the detection results from the improved vegetable detection (IVD) network, with vegetables highlighted by bounding boxes. The third row presents binary segmentation images generated through excess green (ExG)-based vegetation enhancement followed by Otsu thresholding. The fourth row shows the results after vegetable removal and area filtering to isolate true weed regions. The fifth row displays the generated weeding trajectories used to guide precision weed control operations.

Weed Mapping

A precise weed map was established based on the weed detection results. The original images were divided into smaller, equally sized grid cells. Cells containing weeds were marked in red, representing the designated weeding areas, while the remaining grid cells were identified as requiring no weeding. The weed mapping results are displayed in the fifth row of Figure 7. With the weeding regions clearly highlighted, this approach enhances the feasibility of practical weeding applications.

Path Planning

The path planning strategy was executed based on the weed mapping results. Three path planning algorithms were designed and tested for comparison and analysis. The path planning results for the four sample images are shown in Figure 8, while the evaluation metrics for efficiency and effectiveness are presented in Table 2. The blue line in Figure 8 represents the weeding trajectory for a smart machine. The Dijkstra algorithm exhibited a significant advantage in computational efficiency in this experiment: for the four given images, it consistently produced the shortest path and required the least computation time for the weeding operation. In contrast, the Christofides algorithm performed poorly, with longer computation times and path lengths. Notably, for the third image (Figure 8C), the Christofides algorithm took 13 times longer to compute and produced a weeding path 216 pixels longer than that of the Dijkstra algorithm. DP showed inconsistent performance: while it required less time than the Christofides algorithm for the images in Figure 8A and 8B, it took relatively more time for those in Figure 8C and 8D. Overall, the Dijkstra algorithm performed exceptionally well in terms of both computational efficiency and optimal path planning.

Figure 8. Path planning results for precision weeding based on weed mapping. The blue lines represent the optimized weeding trajectories generated by different path planning algorithms (Christofides, Dijkstra, and dynamic programming [DP]) across four sample images. These results illustrate the application of trajectory optimization for efficient weed control operations.

Table 2. Performance comparison of three path planning algorithms on four sample weed maps, with execution time and shortest path length (in pixels) reported for each algorithm across four images labeled A, B, C, and D

a DP, dynamic programming.

Direct detection of different weed species, morphologies, densities, and growth stages is a challenging task, as it requires labeling a large volume of weed image data, which is both labor-intensive and time-consuming (Yu et al. 2019c). Additionally, collecting and labeling weed datasets is tedious, and such datasets are often nontransferable across different crops. This study proposes an efficient deep learning network based on YOLOv8 trained to detect vegetables instead of weeds. By focusing on vegetables, the approach bypasses the complexities associated with managing diverse weed characteristics.

With rising living standards, there is increasing demand for green, organic vegetables, which are grown without the use of synthetic herbicides (Rahman et al. 2021; Reganold and Wachter 2016). In this context, smart mechanical weeding machines equipped with accurate weed detection systems offer an ideal solution for performing weeding tasks in organic vegetable crops. Effective weed detection systems aim to eliminate weeds while avoiding damage to crops. The proposed method achieved this by accurately detecting vegetable crops and excluding them from the weeding process, ensuring precision in weed control.

The YOLO series of deep learning architectures is widely recognized for its efficiency in object detection and adaptability to diverse tasks (Badgujar et al. 2024). The IVD was developed based on the YOLOv8 architecture, with enhancements such as the C2f-Faster-EMA module in the backbone stage and an improved feature fusion with a slim-neck at the neck stage. Ablation experiment results showed reduced computational costs, with parameters reduced by 0.57 million, model size by 1.1 MB, and GFLOPs by 1.8 compared with the original YOLOv8 network. This optimization makes the network more lightweight while maintaining excellent detection precision, making it highly suitable for real-time weeding applications.

Some bounding boxes generated by the trained network were observed to partially or completely overlap, as illustrated in the second row of Figure 7. This overlap often occurs when vegetables are closely spaced, potentially reducing the recall value. However, this issue has minimal impact on final weed detection, because the vegetables are accurately identified within the bounding boxes and excluded before weed segmentation through image processing methods.

The attention mechanism is commonly employed to enhance the processing of sequential data (Hassanin et al. 2024). EMA, a novel and highly efficient attention mechanism, captures both channel and spatial information simultaneously, improving feature representation without increasing computational costs (Marsella and Gratch 2009). FasterNet is recognized for its high processing speed, owing to its use of partial convolution to reduce redundant computation and memory access (Chen et al. 2023). When EMA and the Faster Block of FasterNet are combined, overall efficiency is significantly boosted. This improvement was clearly demonstrated in the ablation experiment in which only the C2f-Faster-EMA module was integrated.

Extensive research has been conducted on detecting weeds across various crop categories, achieving outstanding detection accuracy and significantly advancing the development of precision agriculture (Peng et al. 2022; Wang et al. 2019; Yu et al. 2019b). To further utilize the detection results, weed mapping was constructed after detecting vegetables, followed by weed segmentation through image processing. The original images were systematically divided into grid cells, with only those containing weeds marked as weeding areas. The size of the grid cells can be tailored to the operational area of weeding actuators using weed mapping. This adaptability is crucial, as the size of weeding actuators can vary, thereby enhancing the applicability and efficiency of weeding applications.

Path planning algorithms were integrated with weed mapping to guide the mechanical actuators exclusively to the grid cells containing weeds. In this study, three path planning algorithms were evaluated, with the Dijkstra algorithm emerging as the most effective by balancing computational costs with the shortest path length. Interestingly, the performance of the DP algorithm varied across different images in terms of time consumption, likely due to its memory allocation requirements, which warrants further investigation. In contrast, the Christofides algorithm consistently generated longer paths and required more computation time than the other two algorithms. As a heuristic method based on the Hamiltonian circuit, the Christofides algorithm provides an approximate solution that, while not optimal, ensures that the loop length never exceeds 1.5 times the optimal length, even in the worst-case scenario.

In this study, path planning was creatively applied to vegetable weeding, enabling precise machine-guided weed control. These algorithms, based on weed mapping, can also be adapted for other precision weeding applications. For instance, a smart sprayer can be integrated with path planning algorithms to accurately and efficiently apply herbicides only to the grid cells containing weeds. Further investigation is required to assess the feasibility of integrating path planning and weed mapping for weed control in other cropping systems.

This research proposed an innovative system integrating weed detection, weed mapping, and path planning into a unified approach for precise weeding. Weed detection was performed indirectly by first identifying vegetables through the IVD, with the remaining green vegetation classified as weeds. The IVD demonstrated significant improvements in both precision and efficiency, achieving a 0.2 increase in mAP50 while reducing parameters, GFLOPS, and model size compared with the original YOLOv8 network. Weed mapping serves as a bridge between weed detection and precise weeding applications, effectively defining operational areas for targeted weed control. Among the three path planning algorithms evaluated, the Dijkstra algorithm emerged as the most efficient, offering the shortest weeding path with optimal computational efficiency. This proposed method provides a robust solution for precise weeding and introduces a novel approach with significant potential for broader applications in weed management.

Funding statement

This research was supported by the Weifang Science and Technology Development Plan Project (grant no. 2024ZJ1097), the Key R&D Program of Shandong Province, China (ZR202211070163), the Taishan Scholar Program, and the National Natural Science Foundation of China (grant no. 32072498).

Competing interests

The authors declare no conflicts of interest.

Footnotes

Associate Editor: Nathan S. Boyd, Gulf Coast Research and Education Center

References

Badgujar, CM, Poulose, A, Gan, H (2024) Agricultural object detection with You Only Look Once (YOLO) algorithm: a bibliometric and systematic literature review. Comput Electron Agric 223:109090. doi:10.1016/j.compag.2024.109090
Bakhshipour, A, Jafari, A, Nassiri, SM, Zare, D (2017) Weed segmentation using texture features extracted from wavelet sub-images. Biosyst Eng 157:12. doi:10.1016/j.biosystemseng.2017.02.002
Baldi, P, Brunak, S, Chauvin, Y, Andersen, CA, Nielsen, H (2000) Assessing the accuracy of prediction algorithms for classification: an overview. Bioinformatics 16:412–424. doi:10.1093/bioinformatics/16.5.412
Bellman, R (1954) The theory of dynamic programming. Bull Am Math Soc 60:503–515
Berge, TW, Aastveit, AH, Fykse, H (2008) Evaluation of an algorithm for automatic detection of broad-leaved weeds in spring cereals. Precis Agric 9:391–405. doi:10.1007/s11119-008-9083-z
Chen, J, Kao, SH, He, H, Zhuo, W, Wen, S, Lee, CH, Chan, SH (2023) Run, don't walk: chasing higher FLOPS for faster neural networks. Pages 12021–12031 in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023. Vancouver, Canada: Institute of Electrical and Electronics Engineers (IEEE)
Chen, X, Liu, T, Han, K, Jin, X, Wang, J, Kong, X, Yu, J (2024) TSP-yolo-based deep learning method for monitoring cabbage seedling emergence. Eur J Agron 157:127191. doi:10.1016/j.eja.2024.127191
Chollet, F (2017) Xception: deep learning with depthwise separable convolutions. Pages 1251–1258 in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2017. Honolulu, HI: Institute of Electrical and Electronics Engineers (IEEE)
Chung, CL, Huang, KJ, Chen, SY, Lai, MH, Chen, YC, Kuo, YF (2016) Detecting Bakanae disease in rice seedlings by machine vision. Comput Electron Agric 121:404–411. doi:10.1016/j.compag.2016.01.008
Deng, J, Dong, W, Socher, R, Li, LJ, Li, K, Fei-Fei, L (2009) ImageNet: a large-scale hierarchical image database. Pages 248–255 in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2009. Miami, FL: Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/CVPR.2009.5206848
Deng, L, Miao, Z, Zhao, X, Yang, S, Gao, Y, Zhai, C, Zhao, C (2025) HAD-YOLO: an accurate and effective weed detection model based on improved YOLOv5 network. Agronomy 15:57. doi:10.3390/agronomy15010057
Dong, J, Gruda, N, Li, X, Cai, Z, Zhang, L, Duan, Z (2022) Global vegetable supply towards sustainable food production and a healthy diet. J Clean Prod 369:133212. doi:10.1016/j.jclepro.2022.133212
Dyrmann, M, Karstoft, H, Midtiby, HS (2016) Plant species classification using deep convolutional neural network. Biosyst Eng 151:72–80. doi:10.1016/j.biosystemseng.2016.08.024
Everingham, M, Eslami, SA, Van Gool, L, Williams, CK, Winn, J, Zisserman, A (2015) The Pascal Visual Object Classes challenge: a retrospective. Int J Comput Vision 111:98–136. doi:10.1007/s11263-014-0733-5
Gerhards, R, Andujar Sanchez, D, Hamouz, P, Peteinatos, GG, Christensen, S, Fernandez-Quintanilla, C (2022) Advances in site-specific weed management in agriculture—a review. Weed Res 62:123–133. doi:10.1111/wre.12526
Grandini, M, Bagli, E, Visani, G (2020) Metrics for multi-class classification: an overview. arXiv:2008.05756
Grigorescu, S, Trasnea, B, Cocias, T, Macesanu, G (2020) A survey of deep learning techniques for autonomous driving. J Field Robot 37:362–386. doi:10.1002/rob.21918
Grinblat, GL, Uzal, LC, Larese, MG, Granitto, PM (2016) Deep learning for plant identification using vein morphological patterns. Comput Electron Agric 127:418–424. doi:10.1016/j.compag.2016.07.003
Hamuda, E, Glavin, M, Jones, E (2016) A survey of image processing techniques for plant extraction and segmentation in the field. Comput Electron Agric 125:184–199. doi:10.1016/j.compag.2016.04.024
Hasan, AM, Sohel, F, Diepeveen, D, Laga, H, Jones, MG (2021) A survey of deep learning techniques for weed detection from images. Comput Electron Agric 184:106067. doi:10.1016/j.compag.2021.106067
Hassanin, M, Anwar, S, Radwan, I, Khan, FS, Mian, A (2024) Visual attention methods in deep learning: an in-depth survey. Inform Fusion 108:102417. doi:10.1016/j.inffus.2024.102417
Jin, X, Bagavathiannan, M, Maity, A, Chen, Y, Yu, J (2022a) Deep learning for detecting herbicide weed control spectrum in turfgrass. Plant Methods 18:94. doi:10.1186/s13007-022-00929-4
Jin, X, Bagavathiannan, M, McCullough, PE, Chen, Y, Yu, J (2022b) A deep learning-based method for classification, detection, and localization of weeds in turfgrass. Pest Manag Sci 78:4809–4821. doi:10.1002/ps.7102
Jin, X, Liu, T, Yang, Z, Xie, J, Bagavathiannan, M, Hong, X, Xu, Z, Chen, X, Yu, J, Chen, Y (2023) Precision weed control using a smart sprayer in dormant bermudagrass turf. Crop Prot 172:106302. doi:10.1016/j.cropro.2023.106302
Jin, X, Sun, Y, Che, J, Bagavathiannan, M, Yu, J, Chen, Y (2022c) A novel deep learning-based method for detection of weeds in vegetables. Pest Manag Sci 78:1861–1869. doi:10.1002/ps.6804
Jordan, MI, Mitchell, TM (2015) Machine learning: trends, perspectives, and prospects. Science 349:255–260. doi:10.1126/science.aaa8415
Kashyap, A (2024) A novel method for real-time object-based copy-move tampering localization in videos using fine-tuned YOLO V8. Forensic Sci Int 48:301663
Kong, X, Li, A, Liu, T, Han, K, Jin, X, Chen, X, Yu, J (2024) Lightweight cabbage segmentation network and improved weed detection method. Comput Electron Agric 226:109403. doi:10.1016/j.compag.2024.109403
Kumar, D, Kumar, S, Shekhar, C (2020) Nutritional components in green leafy vegetables: a review. J Pharmacogn Phytochem 9:2498–2502
LeCun, Y, Bengio, Y, Hinton, G (2015) Deep learning. Nature 521:436–444. doi:10.1038/nature14539
Liakos, KG, Busato, P, Moshou, D, Pearson, S, Bochtis, D (2018) Machine learning in agriculture: a review. Sensors 18:2674. doi:10.3390/s18082674
Liu, J, Abbas, I, Noor, RS (2021) Development of deep learning-based variable rate agrochemical spraying system for targeted weeds control in strawberry crop. Agronomy 11:1480. doi:10.3390/agronomy11081480
Mahesh, B (2020) Machine learning algorithms—a review. Int J Sci Res 9:381–386
Marsella, SC, Gratch, J (2009) EMA: a process model of appraisal dynamics. Cogn Syst Res 10:70–90. doi:10.1016/j.cogsys.2008.03.005
Memon, MS, Chen, S, Shen, B, Liang, R, Tang, Z, Wang, S, Zhou, W, Memon, N (2025) Automatic visual recognition, detection and classification of weeds in cotton fields based on machine vision. Crop Prot 187:106966. doi:10.1016/j.cropro.2024.106966
Modi, RU, Kancheti, M, Subeesh, A, Raj, C, Singh, AK, Chandel, NS, Dhimate, AS, Singh, MK, Singh, S (2023) An automated weed identification framework for sugarcane crop: a deep learning approach. Crop Prot 173:106360. doi:10.1016/j.cropro.2023.106360
Morid, MA, Borjali, A, Del Fiol, G (2021) A scoping review of transfer learning research on medical image analysis using ImageNet. Comput Biol Med 128:104115. doi:10.1016/j.compbiomed.2020.104115
Otsu, N (1975) A threshold selection method from gray-level histograms. Automatica 11:23–27
Otter, DW, Medina, JR, Kalita, JK (2020) A survey of the usages of deep learning for natural language processing. IEEE Trans Neural Netw Learn Syst 32:604–624
Pak, M, Kim, S (2017) A review of deep learning in image recognition. Pages 1–3 in Proceedings of the 2017 4th International Conference on Computer Applications and Information Processing Technology (CAIPT). Kuta Bali, Indonesia: Institute of Electrical and Electronics Engineers (IEEE)
Pantazi, XE, Moshou, D, Bravo, C (2016) Active learning system for weed species recognition based on hyperspectral sensing. Biosyst Eng 146:193–202. doi:10.1016/j.biosystemseng.2016.01.014
Papadimitriou, CH, Vazirani, UV (1984) On two geometric problems related to the travelling salesman problem. J Algorithms 5:231–246. doi:10.1016/0196-6774(84)90029-4
Pei, H, Sun, Y, Huang, H, Zhang, W, Sheng, J, Zhang, Z (2022) Weed detection in maize fields by UAV images based on crop row preprocessing and improved YOLOv4. Agriculture 12:975. doi:10.3390/agriculture12070975
Peng, H, Li, Z, Zhou, Z, Shao, Y (2022) Weed detection in paddy field using an improved RetinaNet network. Comput Electron Agric 199:107179. doi:10.1016/j.compag.2022.107179
Perez, AJ, Lopez, F, Benlloch, JV, Christensen, S (2020) Colour and shape analysis techniques for weed detection in cereal fields. Comput Electron Agric 25:197–212. doi:10.1016/S0168-1699(99)00068-X
Prati, RC, Batista, GE, Monard, MC (2011) A survey on graphical methods for classification predictive performance evaluation. IEEE Trans Knowl Data Eng 23:1601–1618. doi:10.1109/TKDE.2011.59
Rahman, SM, Mele, MA, Lee, YT, Islam, MZ (2021) Consumer preference, quality, and safety of organic and conventional fresh fruits, vegetables, and cereals. Foods 10:105. doi:10.3390/foods10010105
Rai, N, Zhang, Y, Ram, BG, Schumacher, L, Yellavajjala, RK, Bajwa, S, Sun, X (2023) Applications of deep learning in precision weed management: a review. Comput Electron Agric 206:107698. doi:10.1016/j.compag.2023.107698
Reganold, JP, Wachter, JM (2016) Organic agriculture in the twenty-first century. Nat Plants 2:1–8. doi:10.1038/nplants.2015.221
Sengupta, S, Lee, WS (2014) Identification and determination of the number of immature green citrus fruit in a canopy under different ambient light conditions. Biosyst Eng 117:51–61. doi:10.1016/j.biosystemseng.2013.07.007
Slaughter, DC, Giles, DK, Downey, D (2008) Autonomous robotic weed control systems: a review. Comput Electron Agric 61:63–78. doi:10.1016/j.compag.2007.05.008
Sokolova, M, Lapalme, G (2009) A systematic analysis of performance measures for classification tasks. Inform Process Manag 45:427–437. doi:10.1016/j.ipm.2009.03.002
Sun, H, Liu, T, Wang, J, Zhai, D, Yu, J (2024) Evaluation of two deep learning-based approaches for detecting weeds growing in cabbage. Pest Manag Sci 80:2817–2826. doi:10.1002/ps.7990
Tao, T, Wei, X (2024) STBNA-YOLOv5: an improved YOLOv5 network for weed detection in rapeseed field. Agriculture 15:22. doi:10.3390/agriculture15010022
Terven, J, Córdova-Esparza, DM, Romero-González, JA (2023) A comprehensive review of YOLO architectures in computer vision: from YOLOv1 to YOLOv8 and YOLO-NAS. Mach Learn Knowl Extr 5:1680–1716
Wang, A, Zhang, W, Wei, X (2019) A review on weed detection using ground-based machine vision and image processing techniques. Comput Electron Agric 158:226–240. doi:10.1016/j.compag.2019.02.005
Weiss, K, Khoshgoftaar, TM, Wang, D (2016) A survey of transfer learning. J Big Data 3:1–40. doi:10.1186/s40537-016-0043-6
Xu, K, Shu, L, Xie, Q, Song, M, Zhu, Y, Cao, W, Ni, J (2023) Precision weed detection in wheat fields for agriculture 4.0: a survey of enabling technologies, methods, and research challenges. Comput Electron Agric 212:108106. doi:10.1016/j.compag.2023.108106
Xu, MH, Liu, YQ, Huang, QL, Zhang, YX, Luan, GF (2007) An improved Dijkstra's shortest path algorithm for sparse network. Appl Math Comput 185:247–254
Yu, J, Schumann, AW, Cao, Z, Sharpe, SM, Boyd, NS (2019a) Weed detection in perennial ryegrass with deep learning convolutional neural network. Front Plant Sci 10:1422. doi:10.3389/fpls.2019.01422
Yu, J, Schumann, AW, Sharpe, SM, Li, X, Boyd, NS (2020) Detection of grassy weeds in bermudagrass with deep convolutional neural networks. Weed Sci 68:545–552. doi:10.1017/wsc.2020.46
Yu, J, Sharpe, SM, Schumann, AW, Boyd, NS (2019b) Deep learning for image-based weed detection in turfgrass. Eur J Agron 104:78–84. doi:10.1016/j.eja.2019.01.004
Yu, J, Sharpe, SM, Schumann, AW, Boyd, NS (2019c) Detection of broadleaf weeds growing in turfgrass with convolutional neural networks. Pest Manag Sci 75:2211–2218. doi:10.1002/ps.5349
Zhang, Z, Geiger, J, Pohjalainen, J, Mousa, AE, Jin, W, Schuller, B (2018) Deep learning for environmentally robust speech recognition: an overview of recent developments. ACM Trans Intell Syst Technol 9:1–28
Zhu, J, Hu, T, Zheng, L, Zhou, N, Ge, H, Hong, Z (2024) YOLOv8-C2f-Faster-EMA: an improved underwater trash detection model based on YOLOv8. Sensors 24:2483. doi:10.3390/s24082483
Zhuang, J, Li, X, Bagavathiannan, M, Jin, X, Yang, J, Meng, W, Li, T, Li, L, Wang, Y, Chen, Y, Yu, J (2022) Evaluation of different deep convolutional neural networks for detection of broadleaf weed seedlings in wheat. Pest Manag Sci 78:521–529. doi:10.1002/ps.6656