Enhanced weed and crop species classification using optimized machine learning and ensemble techniques

Open Access | Oct 2025

Figures & Tables

Figure 1:

Overview of the proposed work. LightGBM, light gradient boosting machine; PCA, principal component analysis; RF, random forest; RFE, recursive feature elimination; SVM, support vector machine.

Figure 2:

F1 score analysis. SVM, support vector machine.

Figure 3:

Accuracy analysis.

Class-wise F1-scores (selected classes)

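| Class name | RF | SVM | LightGBM | Ensemble (stacked) |
|---|---|---|---|---|
| Maize | 97.1 | 96.2 | 97.3 | 98.0 |
| Charlock | 93.5 | 91.4 | 94.0 | 95.1 |
| Bindweed (weed) | 91.0 | 88.7 | 92.6 | 93.4 |
| Wild oat (weed) | 89.6 | 87.9 | 91.3 | 92.5 |
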

Class-wise error rates

| Confused class pair | Misclassification rate (%) | Primary cause | Suggested solution |
|---|---|---|---|
| Charlock ↔ chickweed | 4.2 | High visual similarity | Integrate texture-based GLCM features |
| Wild oat ↔ fat hen | 3.7 | Overlapping foliage in images | Use morphological shape descriptors |
| Bindweed ↔ cleavers | 3.5 | Poor contrast under shadows | Contrast-limited adaptive histogram equalization (CLAHE) |

Individual classifier performance

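| Classifier | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) | Inference time (ms/image) |
|---|---|---|---|---|---|
| RF | 94.1 | 93.8 | 93.6 | 93.7 | 92.4 |
| SVM (RBF) | 92.7 | 91.9 | 92.3 | 92.1 | 108.7 |
| XGBoost | 94.5 | 94.2 | 94.0 | 94.1 | 79.6 |
| LightGBM | 94.8 | 94.6 | 94.3 | 94.4 | 76.3 |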
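
The F1-scores reported for the individual classifiers agree, to within rounding, with the harmonic mean of the reported precision and recall. The sketch below recomputes them from the table values. It assumes the F1 column was derived from the aggregate precision and recall; the source does not say whether the figures are macro- or micro-averaged, so treat this as a consistency check rather than the authors' computation. The function name `f1_from_precision_recall` is illustrative.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) copied from the classifier table.
reported = {
    "RF": (93.8, 93.6, 93.7),
    "SVM (RBF)": (91.9, 92.3, 92.1),
    "XGBoost": (94.2, 94.0, 94.1),
    "LightGBM": (94.6, 94.3, 94.4),
}

# Recompute F1 for every classifier and print it next to the reported value.
recomputed = {
    name: f1_from_precision_recall(p, r) for name, (p, r, _) in reported.items()
}
for name, (_, _, f1) in reported.items():
    print(f"{name}: reported {f1:.1f}, recomputed {recomputed[name]:.2f}")
```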

Ensemble model evaluation

| Ensemble strategy | Base models | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|---|
| Hard voting | RF + SVM + LightGBM | 95.1 | 95.0 | 94.9 | 94.9 |
| Soft voting | RF + XGB + LightGBM | 95.3 | 95.2 | 95.0 | 95.1 |
| Stacking (LogReg) | All four classifiers | 95.6 | 95.4 | 95.3 | 95.3 |
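
Hard and soft voting can disagree on the same image, which is one reason their scores differ. The following minimal sketch shows the mechanics on a single hypothetical image: hard voting counts each model's top class, while soft voting averages the class probabilities first. The class list, the probability values, and the helper names are invented for illustration and are not taken from the paper.

```python
from collections import Counter

CLASSES = ["maize", "charlock", "bindweed"]

# Hypothetical per-class probabilities from three base models for ONE image.
probabilities = {
    "RF":       [0.40, 0.35, 0.25],
    "SVM":      [0.45, 0.40, 0.15],
    "LightGBM": [0.05, 0.90, 0.05],
}


def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


def hard_vote(probs):
    """Each model casts one vote for its top class; the majority wins."""
    votes = [CLASSES[argmax(p)] for p in probs.values()]
    return Counter(votes).most_common(1)[0][0]


def soft_vote(probs):
    """Average the class probabilities across models, then take the top class."""
    n_models = len(probs)
    mean = [sum(p[i] for p in probs.values()) / n_models
            for i in range(len(CLASSES))]
    return CLASSES[argmax(mean)]


print("hard voting:", hard_vote(probabilities))  # two of three models pick maize
print("soft voting:", soft_vote(probabilities))  # the confident model tips it to charlock
```

Stacking goes one step further by training a meta-learner (logistic regression in the table) on the base models' outputs instead of applying a fixed rule.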

Feature selection comparison

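| Feature set | No. of features | PCA used | Accuracy (%) | Training time (s) |
|---|---|---|---|---|
| All raw features | 52 | No | 91.6 | 48.3 |
| PCA (95% variance) | 20 | Yes | 93.2 | 33.2 |
| Mutual info selected | 18 | No | 94.0 | 31.4 |
| RFE (with RF) | 15 | No | 94.7 | 29.5 |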
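
The three reduction strategies compared above can be sketched with scikit-learn. This is an illustration under stated assumptions, not the authors' pipeline: synthetic data from `make_classification` stands in for the 52 image features, and only the feature counts (52, 18, 15) and the 95% variance threshold come from the table. The number of PCA components found here will differ from the 20 reported, since it depends on the data.

```python
from functools import partial

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectKBest, mutual_info_classif
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 52 raw features (assumption, not the paper's data).
X, y = make_classification(
    n_samples=300, n_features=52, n_informative=15, n_redundant=5,
    n_classes=3, random_state=0,
)
X_scaled = StandardScaler().fit_transform(X)

# 1) PCA: keep the fewest components that explain at least 95% of the variance.
pca = PCA(n_components=0.95, svd_solver="full")
X_pca = pca.fit_transform(X_scaled)

# 2) Mutual information: keep the 18 features most informative about the label.
mi_selector = SelectKBest(partial(mutual_info_classif, random_state=0), k=18)
X_mi = mi_selector.fit_transform(X_scaled, y)

# 3) RFE with a random forest: repeatedly drop the least important features
#    (5 per round here, to keep the sketch fast) until 15 remain.
rfe = RFE(
    RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=15, step=5,
)
X_rfe = rfe.fit_transform(X_scaled, y)

print("PCA components kept:", X_pca.shape[1])
print("Mutual-info features kept:", X_mi.shape[1])
print("RFE features kept:", X_rfe.shape[1])
```

Unlike PCA, the mutual-information and RFE selectors keep a subset of the original features, so the surviving columns remain directly interpretable.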

Dataset details

| Class type | Species name | Number of images | Source |
|---|---|---|---|
| Crop | Maize, sugar beet, cleavers, black-grass, etc. | 10,000 | Kaggle (plant seedlings) |
| Weed | Dandelion, thistle, bindweed, wild oat, crabgrass | 1,500 | Custom-curated weed dataset |
| Total | 15 classes | 11,500 | |

Comparative performance with existing works

| Study | Dataset size | Methodology | Accuracy (%) | Feature selection | Ensemble used |
|---|---|---|---|---|---|
| Rajendran and Thirunavukkarasu [7] | 11,500 | Traditional ML + textural | 94.3 | Manual selection | No |
| Gai et al. [3], JFR | 4,000 | Color + Depth Fusion + ML | 90.7 | PCA | No |
| Hasan et al. [11], Crop Prot. | 8,500 | Lightweight DL (YOLO-based) | 93.5 | CNN features | No |
| Moldvai et al. [8], Applied Sciences | 3,200 | Vision + CNN (small dataset) | 92.1 | Image Thresholding | No |
| Proposed Method (2025) | 11,500 | ML + feature engineering + stacking | 95.6 | PCA + RFE | Yes (stacking) |

Language: English
Submitted on: Feb 1, 2025
Published on: Oct 4, 2025
Published by: Professor Subhas Chandra Mukhopadhyay
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 R. Sathya, K.S. Thirunavukkarasu, published by Professor Subhas Chandra Mukhopadhyay
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.