
An optimized framework for epileptic seizure detection using DWT-based feature extraction and hybrid dimensionality reduction

Open Access | Dec 2025


I.
Introduction

Epilepsy is a chronic neurological disorder characterized by abnormal electrical activity in the brain, known as seizures, which can be described metaphorically as an “electrical storm” in the brain [1, 2]. The main physiological basis of epilepsy is a disturbance in the brain’s electrical activity, which can arise from a range of causes, including low blood sugar levels and reduced oxygen supply during childbirth [3, 4]. Globally, around 50 million people are affected by epilepsy, and approximately 100 million individuals experience at least one seizure in their lifetime [3, 5]. The disorder accounts for about 0.5% of the global disease burden, with control rates remaining as low as 0.5%–1% [6]. Epileptic seizures can be detected by monitoring brain activity using electroencephalogram (EEG) or electrocorticography (ECoG), which record electrical signals generated by neurons in the brain. These signals are inherently complex, non-linear, non-stationary, and often contain significant noise, presenting substantial challenges for seizure detection and classification. Effective seizure detection relies on identifying patterns within the EEG data without compromising performance, which has been approached through various machine learning (ML) classifiers. The primary challenge, however, lies in selecting optimal features and classifiers to maximize detection accuracy. In recent years, there has been a focus on utilizing ML algorithms with statistical feature representations, including both “black-box” and “non-black-box” methods, to improve seizure detection accuracy [7].

Epileptic seizures are commonly classified into two main types based on symptoms: partial seizures, affecting a specific area of the brain, and generalized seizures, involving multiple brain regions simultaneously [5, 8]. For feature extraction, the discrete wavelet transform (DWT) [9] has demonstrated substantial effectiveness in analyzing EEG signals across different frequency bands, enabling the identification of key features that facilitate epilepsy detection. DWT provides a robust framework for extracting and isolating relevant features, thereby reducing data dimensionality and enhancing classifier performance. To further enhance the classification process, several dimensionality reduction techniques, including principal component analysis (PCA) [10], independent component analysis (ICA) [11], and linear discriminant analysis (LDA) [12], are used to remove irrelevant features while retaining key signal characteristics.

In the literature, numerous studies have explored the use of statistical functions—such as standard deviation, average power, mean absolute value, mean, variance, Shannon entropy, and skewness—as features for classifying EEG signals with various ML methods. A comparative analysis of classifiers, including support vector machine (SVM) [13], Naive Bayes (NB) [14], and K-nearest neighbors (KNN) [15], has shown that SVM generally excels in accuracy, whereas NB and KNN produce favorable results under certain conditions [1]. However, the high-dimensional feature space in these models often leads to increased time complexity, posing a significant limitation for real-time seizure detection applications.

Another common challenge in EEG signal processing is noise from sources such as electrooculogram (EOG) and electrocardiogram (ECG) artifacts, which can distort the analysis of brain activity. To address this, recent methods have applied a combination of ICA, common spatial pattern (CSP), and wavelet transformation to decompose EEG signals into independent components, isolate wavelet coefficients, and remove noise. By applying CSP to denoise the EEG data, these approaches maintain the neural activity patterns essential for seizure detection [16].

This study aims to address these challenges by proposing a two-step feature reduction process that improves classification accuracy while reducing time complexity. First, we apply DWT for feature extraction, followed by three distinct dimensionality reduction techniques (PCA, ICA, and LDA) to reduce the feature dimensions. Second, a feature-level fusion technique is applied to achieve further dimensionality reduction. Finally, we employ three different classifiers (SVM, NB, and KNN) to assess the performance of this hybrid model in detecting epilepsy.

Despite extensive research, real-time seizure detection remains hindered by noisy EEG signals and high-dimensional feature spaces. We propose a novel hybrid framework that integrates DWT-based feature extraction with layered dimensionality reduction and classifier optimization. Our model not only achieves 100% accuracy with LDA-NB but also demonstrates robustness across multiple EEG subsets, outperforming traditional approaches.

The main contributions of this study are as follows:

  • A two-step feature reduction process for epileptic seizure detection using DWT for feature extraction is proposed, followed by dimensionality reduction techniques, namely PCA, ICA, and LDA, to effectively reduce the feature space.

  • A feature-level dimension reduction technique is applied to optimize feature dimensions, enhancing computational efficiency without compromising classification accuracy.

  • The performance is evaluated using three different classifiers—SVM, NB, and KNN—in detecting epilepsy based on the proposed feature extraction and reduction pipeline.

  • The proposed approach achieves high accuracy in seizure detection, with the LDA and NB combination yielding 100% accuracy on the Bonn dataset, demonstrating the effectiveness of this hybrid model over existing methods.

  • Extensive experiments on EEG data with noise removal techniques are conducted, demonstrating the robustness of the proposed model in handling noise while preserving essential neural activity patterns for accurate seizure detection.

The remainder of this paper is organized as follows: Section II reviews existing epilepsy detection methods, Section III presents relevant techniques, Section IV details our proposed methodology and algorithm, Section V provides the dataset description, data preprocessing, evaluation metrics, and experimental setups, and Section VI highlights and discusses our findings. We conclude the paper in Section VII.

II.
Literature Review

This section provides an overview of previous research on epilepsy prediction techniques, highlighting the role of ML in health and biological datasets. According to Prochazka et al. [17] and Çınar and Acır [18], numerous ML applications have emerged within healthcare, facilitating improved outcomes. Researchers, particularly in data mining and ML, have proposed solutions to enhance seizure detection accuracy, employing various classifiers such as artificial neural networks (ANN), SVM, decision trees, and random forests [19, 20]. ML has proven instrumental in extracting meaningful patterns from health-related datasets and contributes to addressing challenges in healthcare [19,20,21,22,23,24,25,26]. Applications of these techniques to brain datasets include seizure detection, epilepsy lateralization, differentiation of seizure states, and localization [19, 20].

Amin et al. [27] proposed a feature extraction method using DWT on EEG signals, in which relative wavelet energy was calculated from both the approximation and detail coefficients at the final decomposition level. Their study demonstrated high classification accuracy, achieving 98% with SVM for EEG signals recorded under cognitive and resting conditions, highlighting DWT’s efficacy in distinguishing complex cognitive states. Al-Qerem et al. [1] also employed DWT for feature extraction, leveraging differential evolution for feature selection. Their model, tested on the Bonn dataset, achieved optimal classification by evaluating seven wavelet types, including Discrete Meyer, Reverse Biorthogonal, and Daubechies, and proved efficient for EEG classification. Another study proposed a novel method for classifying date fruits by analyzing their texture with scattering wavelet transforms, which extract robust features unaffected by shape or color similarities; these features are then processed through a stacking ensemble of random forest, SVM, and logistic regression to achieve highly accurate classification across diverse date varieties [28].

Epilepsy, a chronic neurological condition, is often diagnosed through EEG and ECoG data, which are highly complex, noisy, and non-stationary, presenting challenges for accurate seizure detection. ML methods are increasingly applied to process these signals, effectively identifying patterns while addressing noise and non-linearity. Siddiqui et al. [7] reviewed various ML techniques in seizure detection, categorizing methods based on statistical features and classifier types. Their study highlighted the importance of selecting suitable classifiers and features in this domain, distinguishing between black-box and transparent models.

Another work presented a classification framework for Arabic biomedical questions using transformer-based contextual embeddings, enhancing semantic understanding and domain-specific accuracy; by leveraging deep language models, the system achieves improved performance in categorizing complex medical queries in Arabic [29]. A further study proposed a deep learning model using a convolutional neural network (CNN) and transfer learning to classify breast cancer from ultrasound images with high accuracy; by preprocessing the breast ultrasound images (BUSI) dataset and deploying the model in a desktop application, it achieved >90% accuracy, highlighting its clinical potential [30]. Another study evaluated how different activation functions, namely rectified linear unit (ReLU), Leaky ReLU, Sigmoid, and Tanh, impact the performance of CNN architectures (ResNet50, VGG16, and GoogleNet) in classifying images affected by Poisson noise, finding that ResNet50 combined with ReLU consistently delivers the highest accuracy across datasets with 3, 5, and 10 classes, demonstrating strong resilience to noise [31]. Several studies have utilized dimensionality reduction methods to enhance classifier performance for seizure detection. Martis et al. [32] used DWT for ECG signal analysis and applied PCA, LDA, and ICA for feature reduction, achieving an accuracy of 99.28% with SVM and neural networks. Shi et al. [33] introduced a binary harmony search (BHS) algorithm for selecting optimal EEG channels, significantly improving classification accuracy over conventional CSP methods in motor imagery classification.

Additional studies have developed robust feature extraction and denoising methods to address the noisy nature of EEG signals. For example, Geng et al. [16] integrated ICA, DWT, and CSP for EEG signal processing, achieving improved accuracy in identifying EOG and ECG artifacts. Kapoor et al. [34] developed a hybrid model using AdaBoost, random forest, and decision tree classifiers, achieving >96% accuracy on the Children’s Hospital Boston (CHB-MIT) dataset. Amin et al. [27] similarly reported 98% accuracy using DWT with SVM, multi-layer perceptron, and KNN classifiers on the Bonn dataset.

Furthermore, dimensionality reduction techniques for brain-computer interface (BCI) applications were explored by Tan et al. [35], who utilized the Bonn dataset with classifiers such as SVM and KNN, attaining high classification accuracy. Priyanka et al. [36] also employed ANN on the Bonn dataset, achieving 96.9% accuracy.

These studies underscore the growing potential of ML approaches in improving epilepsy diagnosis and advancing seizure detection methodologies. This research builds upon prior work by integrating DWT for feature extraction, dimensionality reduction, and classifier optimization, further contributing to the field of automated epilepsy prediction.

III.
Background

In this section, we discuss the various tools and techniques utilized in our proposed methodology, focusing on feature extraction using DWT and dimensionality reduction methods, namely PCA, ICA, and LDA.

a.
DWT

DWT [9] is widely used in signal processing due to its ability to provide time–frequency localization, making it well suited for analyzing non-stationary signals, such as EEG and ECG [37]. DWT decomposes a one-dimensional signal into two sub-bands: high-frequency (detail) and low-frequency (approximation). Given a signal x, DWT applies a low-pass filter g and a high-pass filter h to generate these sub-bands. This decomposition is mathematically described by:

(1) $y[n] = (x * g)[n] = \sum_{k=-\infty}^{\infty} x[k]\, g[n-k]$

The signal passes through filters in successive levels, with downsampling by a factor of 2 at each stage. The resulting approximations and details capture information across different frequency bands (as illustrated in Figure 1), facilitating multi-level decomposition of the signal:

(2) $y_{\mathrm{low}}[n] = \sum_{k=-\infty}^{\infty} x[k]\, g[2n-k],$

(3) $y_{\mathrm{high}}[n] = \sum_{k=-\infty}^{\infty} x[k]\, h[2n-k].$

Figure 1:

Second level of coefficients.

The Daubechies wavelet family (db4) was selected due to its proven effectiveness in EEG signal analysis, particularly for seizure detection. This wavelet offers good time-frequency localization and has been widely used in previous studies.

The EEG signals were decomposed up to the fifth level using the db4 wavelet. This level was selected to capture the frequency bands relevant to epileptic seizure activity, particularly the delta, theta, alpha, beta, and gamma bands.
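As a concrete illustration of this step, the sketch below extracts fifth-level db4 coefficients from a single EEG segment and uses them as a feature vector. The PyWavelets library and the synthetic Bonn-length segment are assumptions for illustration; the paper does not name a specific implementation.

```python
# Illustrative sketch (PyWavelets assumed): fifth-level db4 decomposition of one
# EEG segment, with the resulting coefficients used as features.
import numpy as np
import pywt

def dwt_features(signal, wavelet="db4", level=5):
    # wavedec returns [cA5, cD5, cD4, cD3, cD2, cD1] for level=5
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    return np.concatenate(coeffs)

segment = np.random.randn(4097)   # placeholder for one Bonn EEG segment
features = dwt_features(segment)
print(features.shape)
```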

b.
PCA

PCA [10] is a linear dimensionality reduction method that projects data along directions of maximum variance, preserving essential information while reducing dimensionality. It is computed through the covariance matrix C of the data, followed by eigenvalue decomposition:

(4) $C = (X - \bar{x})(X - \bar{x})^{T}.$

Eigenvectors and eigenvalues of C determine the principal components. The transformed data are then obtained by projecting the original data onto these components, emphasizing variance [38]:

(5) $\mathrm{Projected\ data} = \left[ V^{T} (X - \bar{x})^{T} \right]^{T}.$
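A minimal NumPy sketch of Eqs. (4) and (5) is given below; the sample-size scaling of the covariance and the number of retained components are illustrative choices, not specified in the text.

```python
# Sketch of Eqs. (4)-(5): PCA via eigendecomposition of the covariance matrix.
# X has shape (n_samples, n_features); the column mean plays the role of x-bar.
import numpy as np

def pca_project(X, n_components=2):
    Xc = X - X.mean(axis=0)               # centre the data (X - x_bar)
    C = Xc.T @ Xc / (Xc.shape[0] - 1)     # covariance matrix, Eq. (4) up to scaling
    eigvals, eigvecs = np.linalg.eigh(C)  # eigendecomposition (ascending eigenvalues)
    order = np.argsort(eigvals)[::-1]     # sort by descending variance
    V = eigvecs[:, order[:n_components]]  # top principal directions
    return Xc @ V                         # projected data, Eq. (5)

X = np.random.randn(100, 32)
print(pca_project(X, 3).shape)            # (100, 3)
```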

c.
ICA

ICA [11] separates statistically independent components from mixed signals, often used for noise reduction in EEG signal processing. Given a vector x of mixed signals and a matrix A representing weights, the ICA model is formulated as:

(6) $x = As \quad \text{or} \quad x = \sum_{i=1}^{n} a_i s_i.$

The goal is to estimate both A and the source signals s under the assumption of non-Gaussian, statistically independent sources [7].
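The following sketch illustrates the model x = As with scikit-learn's FastICA; the choice of FastICA and the synthetic non-Gaussian sources are assumptions made purely for illustration.

```python
# Illustrative sketch of the ICA model x = A s (FastICA is an assumed algorithm).
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
S_true = rng.laplace(size=(1000, 3))   # non-Gaussian, statistically independent sources s
A_true = rng.normal(size=(3, 3))       # mixing matrix A
X = S_true @ A_true.T                  # observed mixtures x = A s

ica = FastICA(n_components=3, random_state=0)
S_est = ica.fit_transform(X)           # estimated sources
A_est = ica.mixing_                    # estimated mixing matrix
print(S_est.shape, A_est.shape)
```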

d.
LDA

LDA [12] maximizes separation between classes by finding a linear combination of features that best discriminates among them. For a class mean vector $\mu_i$ and pooled covariance matrix $\Sigma$, the linear score function is defined as:

(7) $S_i^{L}(x) = -\frac{1}{2}\mu_i^{T}\Sigma^{-1}\mu_i + \mu_i^{T}\Sigma^{-1}x + \log P(\pi_i)$

LDA is commonly used for dimensionality reduction and classification in high-dimensional data.
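For completeness, a small sketch of the score function in Eq. (7) follows; the class means, pooled covariance, and priors shown are illustrative values only.

```python
# Sketch of the LDA linear score function in Eq. (7) with a shared (pooled) covariance.
import numpy as np

def lda_scores(x, means, pooled_cov, priors):
    """Return S_i^L(x) for each class i; the predicted class maximizes the score."""
    inv_cov = np.linalg.inv(pooled_cov)
    return np.array([
        -0.5 * mu @ inv_cov @ mu + mu @ inv_cov @ x + np.log(p)
        for mu, p in zip(means, priors)
    ])

# Two illustrative classes in a 2-D feature space
means = [np.array([0.0, 0.0]), np.array([2.0, 1.0])]
pooled_cov = np.eye(2)
priors = [0.5, 0.5]
x = np.array([1.8, 0.9])
print(lda_scores(x, means, pooled_cov, priors).argmax())   # predicted class index
```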

IV.
Proposed Approach

The proposed methodology involves a multi-step approach for epileptic seizure detection, as depicted in Figure 2. The process involves:

  • Applying DWT for multi-level decomposition to capture relevant features.

  • Reducing feature dimensionality through PCA, ICA, and LDA.

  • Classifying the features using SVM, KNN, and NB classifiers.

Figure 2:

Block diagram of the proposed method. DWT, discrete wavelet transform; ICA, independent component analysis; KNN, K-nearest neighbor; LDA, linear discriminant analysis; NB, Naive Bayes; PCA, principal component analysis; SVM, support vector machine.

The fifth level of DWT decomposition provides six subbands of EEG signals (Figure 3), enabling robust classification. Each subband is then processed using PCA, ICA, or LDA, and the resulting features are classified for epileptic seizure detection with high accuracy.

Figure 3:

Block diagram of fifth level decomposition of the EEG signal. EEG, electroencephalogram.

Furthermore, Algorithm 1 presents the workflow of the proposed approach. The algorithm for epileptic seizure detection is designed to process EEG signal data by following a structured approach that includes pre-processing, feature extraction, dimensionality reduction, and classification. First, the EEG data, denoted as X, undergo pre-processing. This step involves normalizing the data to bring all sample values to a consistent scale, which helps improve the stability and performance of classifiers.

In the preprocessing step, a 0.53–40 Hz band-pass filter is used to remove noise and artifacts, followed by normalization in the subsequent step.

Normalization is particularly important in EEG signal processing as it enhances the model’s ability to detect subtle variations that could indicate a seizure, while also reducing the influence of noise or artifacts in the data.
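One possible implementation of this preprocessing step is sketched below. The Butterworth design and filter order are assumptions (the paper specifies only the 0.53–40 Hz pass band), and the sampling rate follows the Bonn dataset description in Section V.

```python
# Sketch of preprocessing: 0.53-40 Hz band-pass filtering followed by z-score
# normalization. Filter family and order are assumptions; only the band is stated.
import numpy as np
from scipy.signal import butter, filtfilt

FS = 173.61  # Hz, Bonn dataset sampling rate

def preprocess(signal, low=0.53, high=40.0, order=4):
    b, a = butter(order, [low, high], btype="bandpass", fs=FS)
    filtered = filtfilt(b, a, signal)                      # zero-phase filtering
    return (filtered - filtered.mean()) / filtered.std()   # zero mean, unit variance

x = np.random.randn(4097)                                  # placeholder EEG segment
print(round(preprocess(x).mean(), 6), round(preprocess(x).std(), 6))
```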

After pre-processing, the algorithm proceeds with feature extraction using DWT. DWT is an effective tool for decomposing EEG signals into various frequency bands, which enables the analysis of distinct signal characteristics relevant to seizure detection. In our experiments, we increased the decomposition level one step at a time up to the seventh level and, at each level, varied the Daubechies mother wavelet from db1 to db10. Comparing the performance of each configuration, the fifth decomposition level with the db1 wavelet provided the best result; accordingly, the proposed model uses the fifth level of decomposition.
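The sweep described above can be organized as in the following sketch. The LDA + NB scoring pipeline mirrors the best-performing combination reported later in the paper, but the signals, labels, and the 5-fold scoring inside the loop are placeholders and illustrative choices.

```python
# Sketch of the configuration sweep: decomposition levels 1-7 and Daubechies
# wavelets db1-db10, each scored by cross-validated accuracy (placeholder data).
import numpy as np
import pywt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

def dwt_feats(sig, wavelet, level):
    return np.concatenate(pywt.wavedec(sig, wavelet, level=level))

signals = np.random.randn(40, 4097)   # placeholder EEG segments
labels = np.repeat([0, 1], 20)        # placeholder seizure/non-seizure labels

best = None
for level in range(1, 8):
    for db in range(1, 11):
        wavelet = f"db{db}"
        X = np.array([dwt_feats(s, wavelet, level) for s in signals])
        clf = make_pipeline(LinearDiscriminantAnalysis(), GaussianNB())
        acc = cross_val_score(clf, X, labels, cv=5).mean()
        if best is None or acc > best[0]:
            best = (acc, level, wavelet)
print(best)   # (best accuracy, best level, best wavelet)
```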

Algorithm 1:
Epileptic seizure detection

Input: EEG data X

Output: Seizure or non-seizure classification

Step 1: Pre-processing

Normalize the EEG data X to a consistent scale.

Step 2: Feature Extraction

Apply DWT on X to decompose the EEG signal.

Extract features from the fifth-level decomposition coefficients.

Step 3: Dimensionality Reduction

Choose one of the following methods for dimensionality reduction:

  • PCA

  • ICA

  • LDA

Obtain reduced feature set Freduced.

Step 4: Classification

Choose one of the following classifiers:

  • SVM

  • KNN

  • NB

Train the classifier on Freduced.

Step 5: Testing and Evaluation

Test the trained model on unseen data.

Compute evaluation metrics (e.g., accuracy, precision, and recall).

Return the classification result as seizure or non-seizure

In Algorithm 1, we specifically use the fifth level of decomposition, which provides a balance between detail and approximation by capturing the primary subbands associated with seizure activity. The resulting DWT coefficients at this level serve as the extracted features, representing the signal’s most informative frequencies in terms of seizure detection.

To reduce the high dimensionality of the extracted features, the algorithm applies a dimensionality reduction technique. Users can choose among three methods: PCA, ICA, or LDA. PCA reduces dimensions by finding orthogonal components that account for the most variance, which allows us to retain the most informative aspects of the data. ICA, in contrast, identifies statistically independent components, making it effective in reducing noise and redundancy. LDA is particularly advantageous in supervised tasks like seizure detection as it maximizes the separability between seizure and non-seizure classes. The outcome of this step is a reduced feature set Freduced, which retains the essential signal characteristics while lowering computational complexity.

With the reduced feature set prepared, the algorithm then moves to the classification stage. The user can select from three classification methods: SVM, KNN, or NB. SVM is a powerful classifier that finds the optimal hyperplane to separate seizure and non-seizure classes, making it highly suitable for high-dimensional spaces. KNN, in contrast, is a simpler approach that classifies each sample based on the labels of its k nearest neighbors in the feature space, providing an intuitive method for pattern recognition. NB employs a probabilistic framework based on Bayes’ theorem, which is often effective for binary classification tasks such as this. The classifier is trained on labeled EEG data, allowing it to learn the distinctions between seizure and non-seizure patterns.

Finally, the trained model is tested on unseen data, where it predicts seizure or non-seizure labels for each sample. Evaluation metrics, such as accuracy, precision, and recall, are computed to measure the model’s performance and assess its effectiveness in detecting seizures accurately. The output of the algorithm is a classification result indicating whether each sample is labeled as a seizure or non-seizure event. By combining DWT-based feature extraction with dimensionality reduction and reliable classification methods, this algorithm provides a comprehensive approach to accurately identify epileptic seizures from EEG data.
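Putting the steps of Algorithm 1 together, a compact sketch with a selectable reducer and classifier might look as follows. The scikit-learn implementations, component counts, and placeholder data are assumptions for illustration, not the authors' exact code.

```python
# Compact sketch of Algorithm 1: preprocessing, fifth-level db4 DWT features,
# a selectable dimensionality reducer, a selectable classifier, and evaluation.
import numpy as np
import pywt
from scipy.stats import zscore
from sklearn.decomposition import PCA, FastICA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

REDUCERS = {"pca": PCA(n_components=10),
            "ica": FastICA(n_components=10, random_state=0),
            "lda": LinearDiscriminantAnalysis()}
CLASSIFIERS = {"svm": SVC(), "knn": KNeighborsClassifier(), "nb": GaussianNB()}

def extract(sig):
    # Steps 1-2: normalization followed by fifth-level db4 DWT feature extraction
    return np.concatenate(pywt.wavedec(zscore(sig), "db4", level=5))

def evaluate(raw_segments, y, reducer="lda", classifier="nb", folds=10):
    # Steps 3-5: reduce dimensionality, train the classifier, and cross-validate
    X = np.array([extract(s) for s in raw_segments])
    model = make_pipeline(REDUCERS[reducer], CLASSIFIERS[classifier])
    return cross_val_score(model, X, y, cv=folds).mean()

raw_segments = np.random.randn(60, 4097)   # placeholder EEG segments
y = np.repeat([0, 1], 30)                  # placeholder labels (0 = non-seizure, 1 = seizure)
print(evaluate(raw_segments, y, "lda", "nb"))
```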

V.
Experimental Setup
a.
Dataset description

For the evaluation of the proposed epileptic seizure detection techniques, we have used the Bonn dataset [39], a widely recognized dataset for seizure detection tasks. The Bonn dataset, which is publicly available, consists of EEG signals recorded from healthy individuals and those diagnosed with epilepsy, and includes data captured during both seizure-free and seizure intervals. The dataset is divided into multiple subsets, each containing EEG data from different subjects, recorded in various settings (e.g., resting, seizure, etc.).

The dataset features time-series EEG signals sampled at 173.61 Hz, with each sample being divided into 23.6-s segments. These segments contain a variety of seizure and non-seizure classes, making the dataset ideal for classification tasks. Each signal is represented by a series of 4,097 data points, making it a suitable choice for the analysis of temporal dynamics involved in epileptic seizures.
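(At 173.61 Hz, a segment of 4,097 samples spans 4,097/173.61 ≈ 23.6 s, consistent with the segment length stated above.)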

In this study, the Bonn dataset was pre-processed using DWT to extract meaningful features from the raw EEG signals. This feature extraction process helps in capturing both time and frequency-domain characteristics, which are essential for distinguishing between normal and seizure conditions. Further dimensionality reduction was performed using techniques such as PCA, ICA, and LDA to enhance the classification performance. For evaluation, we have used standard classification algorithms such as KNN, NB, and SVM.

Details of the Bonn dataset are provided in Table 1. By applying the proposed seizure detection techniques to this dataset, we aim to assess the effectiveness and robustness of the proposed framework, especially with respect to high classification accuracy and generalizability across different conditions.

Table 1:

Samples of data in normal and seizure cases

Set name | Annotation of data | Size (KB) | Acquisition circumstances
Set A | Z000.txt—Z100.txt | 564 | Five healthy subjects with open eyes
Set B | O000.txt—O100.txt | 611 | Five healthy subjects with closed eyes
Set C | N000.txt—N100.txt | 560 | Five people with epilepsy with seizure-free status
Set D | F000.txt—F100.txt | 569 | Five people with epilepsy with seizure-free status inside five epileptogenic zones
Set E | S000.txt—S100.txt | 747 | Five subjects during seizure activity
b.
Data preprocessing

The dataset used in this study is the Bonn dataset, which consists of EEG signals collected from healthy subjects and individuals with epilepsy. The raw EEG data are pre-processed to remove any noise or artifacts, ensuring that only relevant signals are considered for feature extraction. The data are segmented into non-overlapping windows of 23.6 s, as this time window is sufficient to capture the underlying temporal dynamics of epileptic seizures. To standardize the signals across different subjects, each EEG signal is normalized to have a mean of zero and a standard deviation of one, ensuring consistency in the features extracted from different individuals.
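A loading-and-normalization sketch consistent with this description is given below; the directory layout and file patterns are assumptions based on the annotations in Table 1, and only two of the five sets are shown.

```python
# Sketch: load Bonn set A and set E text files (one signal per file, one sample
# per line, per Table 1) and normalize each signal to zero mean, unit variance.
# Directory layout and file patterns are assumed.
import glob
import numpy as np

def load_set(pattern):
    signals = []
    for path in sorted(glob.glob(pattern)):
        sig = np.loadtxt(path)                        # 4,097 samples per file
        signals.append((sig - sig.mean()) / sig.std())
    return np.array(signals)

set_a = load_set("bonn/setA/Z*.txt")   # healthy, eyes open (label 0)
set_e = load_set("bonn/setE/S*.txt")   # seizure activity (label 1)
X = np.vstack([set_a, set_e])
y = np.array([0] * len(set_a) + [1] * len(set_e))
```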

c.
Evaluation metrics

The performance of the proposed techniques is evaluated using standard classification metrics [40], including accuracy, sensitivity, specificity, precision, and F1-score. Accuracy is the proportion of correctly classified instances, while sensitivity and specificity measure the classifier’s ability to detect seizures and non-seizures, respectively. Precision and F1-score provide further insights into the classifier’s performance, especially in dealing with imbalanced datasets.

For the statistical analysis, we use 10-fold cross-validation to ensure the reliability and generalizability of the results. This process helps mitigate overfitting and provides a more robust evaluation of the classifier’s performance.
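A sketch of this evaluation protocol is shown below, using scikit-learn's built-in scorers for accuracy, precision, recall (sensitivity), and F1-score; specificity would require a custom scorer, and the model and data here are placeholders carried over from the earlier sketches.

```python
# Sketch of 10-fold cross-validated evaluation with several metrics.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_validate

model = make_pipeline(LinearDiscriminantAnalysis(), GaussianNB())
X = np.random.randn(100, 20)   # placeholder reduced feature set
y = np.repeat([0, 1], 50)      # placeholder labels

scores = cross_validate(model, X, y, cv=10,
                        scoring=["accuracy", "precision", "recall", "f1"])
for name in ["accuracy", "precision", "recall", "f1"]:
    print(name, round(scores[f"test_{name}"].mean(), 3))
```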

d.
Implementation details

All experiments are implemented using the Python programming language and ML libraries. The experiments are run on a standard desktop machine with an Intel i7 processor and 16 GB of RAM.

VI.
Results and Discussion

The performance metrics for the proposed approach are presented in Tables 2–4, where ICA is combined with the KNN, SVM, and NB classifiers, respectively. The values obtained from each of the 100 runs are averaged and reported in Tables 2–4. These tables highlight the comparative metrics for each classification model. According to the data, the highest average accuracy (i.e., 100%) was achieved with the ICA and NB combination in the 2-class classification for the A–E dataset. Additional metrics, such as F-measure, recall, specificity, sensitivity, and precision, reported in Tables 2–4, further validate the model’s performance.

Table 2:

Performance metrics for ICA with KNN

Case | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | Recall (%) | F-measure
A–C | 88.50 | 93.99 | 81.14 | 85.24 | 93.99 | 0.89
A–D | 83.50 | 82.65 | 83.87 | 83.99 | 82.65 | 0.82
A–E | 93.00 | 100.00 | 86.01 | 88.36 | 100.00 | 0.94
B–C | 91.50 | 93.33 | 89.83 | 90.47 | 93.33 | 0.92
B–D | 91.50 | 93.31 | 90.37 | 90.12 | 93.31 | 0.91
B–E | 92.00 | 100.00 | 84.07 | 86.32 | 100.00 | 0.92

ICA, independent component analysis; KNN, K-nearest neighbor.

Table 3:

Performance metrics for ICA with SVM

Case | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | Recall (%) | F-measure
A–C | 88.00 | 92.68 | 83.67 | 86.33 | 92.68 | 0.89
A–D | 85.50 | 96.03 | 75.96 | 79.14 | 96.03 | 0.86
A–E | 97.50 | 100.00 | 94.96 | 95.55 | 100.00 | 0.98
B–C | 86.50 | 83.63 | 89.65 | 91.07 | 83.63 | 0.87
B–D | 90.50 | 89.63 | 92.59 | 91.72 | 89.63 | 0.90
B–E | 94.50 | 100.00 | 88.78 | 90.69 | 100.00 | 0.95

ICA, independent component analysis; SVM, support vector machine.

Table 4:

Performance metrics for ICA with NB

Case | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | Recall (%) | F-measure
A–C | 72.00 | 82.65 | 61.52 | 67.89 | 82.65 | 0.74
A–D | 72.50 | 97.03 | 47.76 | 65.55 | 97.03 | 0.78
A–E | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.00
B–C | 82.00 | 67.22 | 95.48 | 95.60 | 67.22 | 0.78
B–D | 68.00 | 91.43 | 45.27 | 62.48 | 91.42 | 0.74
B–E | 99.50 | 99.23 | 100.00 | 100.00 | 99.23 | 0.99

ICA, independent component analysis; NB, Naive Bayes.

The proposed method achieved a maximum average sensitivity of 100% across all three combinations (ICA + KNN, ICA + NB, and ICA + SVM) for the A–E dataset. As observed in Table 3, a maximum average sensitivity of 100% was also attained for the B–E dataset. A comparison across Tables 2–4 indicates that the maximum average specificity and F-measure values were obtained for the A–E dataset.

Tables 5–7 illustrate the performance metrics for the PCA dimensionality reduction technique with the KNN, SVM, and NB classifiers, respectively. Based on the data in these tables, the NB classifier achieved the highest accuracy of 100% for the A–E dataset, with recall and F-measure values also reaching 100%. Sensitivity values of 100% were observed for datasets A–E and B–E using SVM and KNN, respectively.

Table 5:

Results of PCA with KNN algorithm

Case | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | Recall (%) | F-measure
A–C | 81.00 | 88.47 | 71.87 | 78.36 | 88.47 | 0.83
A–D | 90.50 | 93.20 | 89.28 | 89.34 | 93.20 | 0.91
A–E | 58.00 | 100.00 | 18.18 | 57.18 | 100.00 | 0.71
B–C | 81.00 | 82.32 | 77.93 | 79.39 | 82.32 | 0.81
B–D | 83.50 | 83.09 | 81.94 | 83.12 | 83.09 | 0.89
B–E | 88.50 | 100.00 | 75.07 | 86.68 | 100.00 | 0.92

KNN, K-nearest neighbors; PCA, principal component analysis.

Table 6:

Results of PCA with SVM algorithm

Case | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | Recall (%) | F-measure
A–C | 77.50 | 96.87 | 55.57 | 71.09 | 96.87 | 0.81
A–D | 84.50 | 97.07 | 71.43 | 77.77 | 97.07 | 0.86
A–E | 93.50 | 100.00 | 86.87 | 89.34 | 100.00 | 0.94
B–C | 83.00 | 93.65 | 71.89 | 77.89 | 93.65 | 0.85
B–D | 85.00 | 93.85 | 74.36 | 80.27 | 93.85 | 0.86
B–E | 90.00 | 100.00 | 80.09 | 83.98 | 100.00 | 0.91

PCA, principal component analysis; SVM, support vector machine.

Table 7:

Results of PCA with NB algorithm

Case | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | Recall (%) | F-measure
A–C | 80.50 | 94.45 | 67.12 | 74.17 | 94.45 | 0.83
A–D | 80.00 | 96.26 | 63.67 | 72.64 | 96.26 | 0.82
A–E | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.00
B–C | 90.50 | 84.28 | 95.71 | 97.50 | 84.28 | 0.90
B–D | 89.00 | 90.31 | 88.38 | 87.95 | 90.31 | 0.89
B–E | 99.50 | 99.00 | 100.00 | 100.00 | 99.00 | 0.99

NB, Naive Bayes; PCA, principal component analysis.

Tables 8–10 provide performance metrics for the LDA technique using the KNN, SVM, and NB classifiers, respectively. Comparing Tables 2–10, the LDA + NB combination yields the best results across all metrics for all datasets. NB achieved 100% accuracy for datasets A–C, A–D, A–E, B–C, B–D, and B–E, with specificity and precision values also at 100% for all dataset combinations.

Table 8:

Results of proposed model LDA with KNN algorithm

Case | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | Recall (%) | F-measure
A–C | 77.50 | 82.95 | 70.30 | 76.89 | 82.95 | 0.78
A–D | 66.50 | 64.45 | 67.84 | 69.40 | 64.45 | 0.66
A–E | 92.00 | 100.00 | 84.65 | 86.49 | 100.00 | 0.92
B–C | 76.50 | 64.08 | 87.32 | 82.89 | 64.08 | 0.72
B–D | 80.00 | 73.44 | 82.98 | 85.54 | 73.44 | 0.77
B–E | 90.00 | 100.00 | 79.61 | 84.92 | 100.00 | 0.91

KNN, K-nearest neighbors; LDA, linear discriminant analysis.

Table 9:

Results of proposed model LDA with SVM algorithm

Case | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | Recall (%) | F-measure
A–C | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.00
A–D | 72.00 | 72.48 | 73.36 | 72.75 | 72.48 | 0.71
A–E | 96.00 | 99.09 | 90.63 | 95.56 | 99.09 | 0.97
B–C | 91.00 | 86.70 | 94.02 | 93.57 | 86.70 | 0.90
B–D | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.00
B–E | 76.00 | 88.38 | 63.88 | 74.31 | 88.38 | 0.80

LDA, linear discriminant analysis; SVM, support vector machine.

Table 10:

Results of proposed model LDA with NB algorithm

Case | Accuracy (%) | Sensitivity (%) | Specificity (%) | Precision (%) | Recall (%) | F-measure
A–C | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.00
A–D | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.00
A–E | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.00
B–C | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.00
B–D | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.00
B–E | 100.00 | 100.00 | 100.00 | 100.00 | 100.00 | 1.00

LDA, linear discriminant analysis; NB, Naive Bayes.

Figures 4–6 display the confusion matrices for PCA combined with the KNN, SVM, and NB classifiers. Each matrix contrasts the actual (true) classes with the predicted classes, providing insight into classification accuracy and performance across these three models. Notably, the SVM-based matrix (Figure 5) differs slightly from those generated by KNN and NB, yet overall, all confusion matrices display robust class prediction capabilities, highlighting the efficacy of PCA in feature reduction.

Figure 4:

Confusion matrix for PCA with KNN algorithm. KNN, K-nearest neighbors; PCA, principal component analysis.

Figure 5:

Confusion matrix for PCA with SVM algorithm. PCA, principal component analysis; SVM, support vector machine.

Figure 6:

Confusion matrix for PCA with NB algorithm. NB, Naive Bayes; PCA, principal component analysis.

Figures 7–9 further present the fold-wise accuracy for ICA, PCA, and LDA with SVM, NB, and KNN classifiers. In Figure 7, we observe that ICA paired with SVM yields higher accuracy than ICA combined with NB or KNN, emphasizing SVM’s strong predictive performance with ICA. Similarly, Figure 8 illustrates the fold-wise accuracy for PCA, where SVM consistently outperforms both KNN and NB. Interestingly, in the case of LDA (Figure 9), the NB classifier surpasses both SVM and KNN in accuracy, suggesting that LDA benefits more from NB’s probabilistic approach to classification.

Figure 7:

Fold-wise accuracy using ICA and SVM, NB, KNN. ICA, independent component analysis; KNN, K-nearest neighbors; NB, Naive Bayes; SVM, support vector machine.

Figure 8:

Fold-wise accuracy using PCA and SVM, NB, KNN. KNN, K-nearest neighbors; NB, Naive Bayes; PCA, principal component analysis; SVM, support vector machine.

Figure 9:

Fold-wise accuracy using LDA and SVM, NB, KNN. KNN, K-nearest neighbors; LDA, linear discriminant analysis; NB, Naive Bayes; SVM, support vector machine.

Finally, Figure 10 presents the receiver operating characteristic (ROC) plots for the classifiers (SVM, NB, and KNN) combined with the dimensionality reduction methods (PCA, ICA, and LDA). The ROC curves for the different classifiers are nearly identical, with all models approaching 100% performance, indicating high true positive rates and minimal misclassification. This consistency suggests that the classifiers are well suited for the dataset and exhibit near-optimal performance across all evaluated dimensionality reduction techniques, confirming their reliability and robustness in classifying EEG signal sets.

Figure 10:

ROC plot for PCA, ICA, LDA and SVM, NB, KNN. ICA, independent component analysis; KNN, K-nearest neighbors; LDA, linear discriminant analysis; NB, Naive Bayes; PCA, principal component analysis; SVM, support vector machine.

The results presented in this study demonstrate the effectiveness of various dimensionality reduction techniques (PCA, ICA, and LDA) combined with different classifiers (KNN, SVM, and NB) for EEG signal classification. The confusion matrices for each classifier revealed high classification accuracy across multiple folds. The ROC plots further confirmed the robustness of the classifiers, with almost identical curves approaching 100% accuracy for all methods. These findings suggest that dimensionality reduction techniques enhance the classification capabilities of ML models for EEG analysis.

VII.
Conclusion

This study presents a novel and efficient approach to epileptic seizure detection by combining DWT for feature extraction with three dimensionality reduction techniques: PCA, LDA, and ICA. The proposed method leverages feature-level fusion to reduce the feature space, ensuring both computational efficiency and high classification accuracy. Experimental results demonstrate that the combination of LDA with NB achieves 100% accuracy, surpassing existing methods in performance. This low-dimensional feature space makes the method suitable for real-time, cost-effective clinical applications, addressing the challenges of time-consuming and expensive epilepsy diagnosis. The simplicity and accuracy of the approach highlight its potential as a practical solution for clinical use, enabling quick and reliable seizure detection. To validate the robustness of the performance differences observed across models, we propose incorporating statistical significance testing in future experiments. Future work will validate the method using larger datasets to assess its robustness and generalizability across diverse populations and seizure types.

Submitted on: Jul 15, 2025 | Published on: Dec 31, 2025

© 2025 Rabel Guharoy, Nanda Dulal Jana, Suparna Biswas, Lalit Garg, Subhayu Ghosh, published by Professor Subhas Chandra Mukhopadhyay
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.