Class-wise F1-scores (selected classes)
| Class name | RF | SVM | LightGBM | Ensemble (stacked) |
|---|---|---|---|---|
| Maize | 97.1 | 96.2 | 97.3 | 98.0 |
| Charlock | 93.5 | 91.4 | 94.0 | 95.1 |
| Bindweed (weed) | 91.0 | 88.7 | 92.6 | 93.4 |
| Wild oat (weed) | 89.6 | 87.9 | 91.3 | 92.5 |
Most frequent class confusions and mitigation strategies
| Confused class pair | Misclassification rate (%) | Primary cause | Suggested solution |
|---|---|---|---|
| Charlock ↔ chickweed | 4.2 | High visual similarity | Integrate texture-based GLCM features |
| Wild oat ↔ fat hen | 3.7 | Overlapping foliage in images | Use morphological shape descriptors |
| Bindweed ↔ cleavers | 3.5 | Poor contrast under shadows | Adaptive histogram equalization (CLAHE) |
Individual classifier performance
| Classifier | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) | Inference time (ms/image) |
|---|---|---|---|---|---|
| RF | 94.1 | 93.8 | 93.6 | 93.7 | 92.4 |
| SVM (RBF) | 92.7 | 91.9 | 92.3 | 92.1 | 108.7 |
| XGBoost | 94.5 | 94.2 | 94.0 | 94.1 | 79.6 |
| LightGBM | 94.8 | 94.6 | 94.3 | 94.4 | 76.3 |
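The per-image inference times in the table can be obtained as wall-clock time over a test batch divided by batch size. A minimal sketch; the model, feature count, and data are placeholders, not the paper's pipeline:

```python
import time

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
X_train = rng.normal(size=(500, 15))          # 15 features, cf. the RFE row below
y_train = rng.integers(0, 2, size=500)
X_test = rng.normal(size=(200, 15))

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

start = time.perf_counter()
clf.predict(X_test)
elapsed_ms = (time.perf_counter() - start) * 1000 / len(X_test)  # ms per image
```

In practice one would repeat the timing loop and report the median to reduce scheduler noise.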
Ensemble model evaluation
| Ensemble strategy | Base models | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|---|
| Hard voting | RF + SVM + LightGBM | 95.1 | 95.0 | 94.9 | 94.9 |
| Soft voting | RF + XGB + LightGBM | 95.3 | 95.2 | 95.0 | 95.1 |
| Stacking (LogReg) | All four classifiers | 95.6 | 95.4 | 95.3 | 95.3 |
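The stacking strategy in the last row maps directly onto scikit-learn's `StackingClassifier` with a logistic-regression meta-learner. A hedged sketch on synthetic data; `GradientBoostingClassifier` stands in for XGBoost/LightGBM, which may not be installed:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
                ("svm", SVC(kernel="rbf", probability=True, random_state=0)),
                ("gbm", GradientBoostingClassifier(random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3)  # out-of-fold predictions feed the meta-learner

acc = stack.fit(X_tr, y_tr).score(X_te, y_te)
```

The cross-validated out-of-fold predictions prevent the meta-learner from overfitting to the base models' training-set outputs, which is the usual argument for stacking over plain soft voting.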
Feature selection comparison
| Feature set | No. of features | PCA used | Accuracy (%) | Training time (s) |
|---|---|---|---|---|
| All raw features | 52 | No | 91.6 | 48.3 |
| PCA (95% variance) | 20 | Yes | 93.2 | 33.2 |
| Mutual info selected | 18 | No | 94.0 | 31.4 |
| RFE (with RF) | 15 | No | 94.7 | 29.5 |
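The best-performing row (RFE with RF, 15 features) corresponds to `sklearn.feature_selection.RFE` wrapped around a random forest. A minimal sketch on synthetic data with the table's 52 raw features:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# 52 raw features, as in the first row of the table
X, y = make_classification(n_samples=400, n_features=52, n_informative=12,
                           random_state=0)

# Recursively drop the least important features until 15 remain
rfe = RFE(estimator=RandomForestClassifier(n_estimators=50, random_state=0),
          n_features_to_select=15).fit(X, y)
X_selected = rfe.transform(X)
```

`rfe.support_` gives the boolean mask of retained features, which can be applied to new data via `transform`.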
Dataset details
| Class type | Species name | Number of images | Source |
|---|---|---|---|
| Crop | Maize, sugar beet, cleavers, black-grass, etc. | 10,000 | Kaggle (plant seedlings) |
| Weed | Dandelion, thistle, bindweed, wild oat, crabgrass | 1,500 | Custom-curated weed dataset |
| Total | 15 classes | 11,500 | — |
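With 10,000 crop images against 1,500 weed images the dataset is noticeably imbalanced, so a stratified train/test split is needed to preserve the weed ratio in both partitions. A sketch with scikit-learn (labels mimic the table's counts; features are placeholders):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# 10,000 crop (label 0) vs 1,500 weed (label 1), as in the dataset table
y = np.array([0] * 10_000 + [1] * 1_500)
X = np.zeros((len(y), 1))  # placeholder feature matrix

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)
weed_ratio_test = (y_te == 1).mean()  # ≈ 1,500 / 11,500 by construction
```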
Comparative performance with existing works
| Study | Dataset size | Methodology | Accuracy (%) | Feature selection | Ensemble used |
|---|---|---|---|---|---|
| Rajendran and Thirunavukkarasu [7] | 11,500 | Traditional ML + textural | 94.3 | Manual selection | No |
| Gai et al. [3], JFR | 4,000 | Color + Depth Fusion + ML | 90.7 | PCA | No |
| Hasan et al. [11], Crop Prot. | 8,500 | Lightweight DL (YOLO-based) | 93.5 | CNN features | No |
| Moldvai et al. [8], Applied Sciences | 3,200 | Vision + CNN (small dataset) | 92.1 | Image Thresholding | No |
| Proposed Method (2025) | 11,500 | ML + feature engineering + stacking | 95.6 | PCA + RFE | Yes (stacking) |