Enhancing human activity recognition with multi-head self-attention and stacked autoencoders

S. Anandanarayanan; S. Thirumaran

doi:10.2478/ijssis-2026-0024

International Journal on Smart Sensing and Intelligent Systems

Volume 19 (2026): Issue 1 (January 2026)

Open Access | May 2026

Figures & Tables

Overview of proposed work. SAE, stacked autoencoder.

Pictorial representation of physical activities (standing, sitting, walking, running, lying down, and climbing stairs) along with their corresponding recognition accuracies based on the MHSA-SAE model. MHSA-SAE, multi-head self-attention enhanced stacked autoencoder.

Accuracy comparison. MASH, method for activity sleep harmonization; MHSA-SAE, multi-head self-attention enhanced stacked autoencoder; SVM, support vector machine.

F1 score comparison. MHSA-SAE, multi-head self-attention enhanced stacked autoencoder; SVM, support vector machine.

ROC curve. AUC, area under the receiver operating characteristic curve; MHSA-SAE, multi-head self-attention enhanced stacked autoencoder.

Ablation study – impact of model components

Configuration	Accuracy (%)	F1-score (%)
SAE only (no attention)	93.10	91.90
SAE + single-head attention	95.30	94.40
Proposed MHSA-SAE (SAE + multi-head attention)	97.82	96.67

Algorithm: MHSA-SAE
Input: X ∈ ℝ^(n×t×d)
Output: Y ∈ ℝ^(n×c)
1. Preprocess:
   X ← Normalize(X)
2. Encode:
   H ← f_SAE(X)
3. Apply attention:
   Q, K, V ← Linear(H)
   A ← MHSA(Q, K, V)
4. Classify:
   Y ← Softmax(W_c · A + b_c)
Return Y
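
The four steps of the algorithm can be sketched as a NumPy forward pass. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights are random and untrained, and the layer sizes, tanh activation, z-score normalization, window length, and channel count are all hypothetical choices. The algorithm lists an output of shape n×c while the attention output still carries the time axis, so the sketch assumes mean pooling over time before the classifier; the source does not state which reduction is used.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(X, eps=1e-8):
    # Step 1 (assumed form): z-score each channel over all windows and time steps.
    mean = X.mean(axis=(0, 1), keepdims=True)
    std = X.std(axis=(0, 1), keepdims=True)
    return (X - mean) / (std + eps)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def sae_encode(X, layers):
    # Step 2: f_SAE as stacked encoder layers applied per time step (tanh is an assumption).
    H = X
    for W, b in layers:
        H = np.tanh(H @ W + b)
    return H

def mhsa(H, Wq, Wk, Wv, Wo, num_heads):
    # Step 3: scaled dot-product self-attention split across num_heads heads.
    n, t, h = H.shape
    dk = h // num_heads

    def split(M):
        # (n, t, h) -> (n, heads, t, dk)
        return M.reshape(n, t, num_heads, dk).transpose(0, 2, 1, 3)

    Q, K, V = split(H @ Wq), split(H @ Wk), split(H @ Wv)
    scores = Q @ K.transpose(0, 1, 3, 2) / np.sqrt(dk)  # (n, heads, t, t)
    heads_out = softmax(scores) @ V                      # (n, heads, t, dk)
    merged = heads_out.transpose(0, 2, 1, 3).reshape(n, t, h)
    return merged @ Wo

def init(shape):
    return rng.normal(0.0, 0.1, size=shape)

# Hypothetical sizes: 4 windows, 50 time steps, 6 sensor channels, 6 activity classes.
n, t, d, hidden, heads, c = 4, 50, 6, 16, 4, 6
X = rng.normal(size=(n, t, d))
sae_layers = [(init((d, 32)), np.zeros(32)), (init((32, hidden)), np.zeros(hidden))]
Wq, Wk, Wv, Wo = (init((hidden, hidden)) for _ in range(4))
Wc, bc = init((hidden, c)), np.zeros(c)

H = sae_encode(normalize(X), sae_layers)
A = mhsa(H, Wq, Wk, Wv, Wo, heads)
pooled = A.mean(axis=1)          # assumption: mean pooling over time, (n, hidden)
Y = softmax(pooled @ Wc + bc)    # Step 4: class probabilities, (n, c)
print(Y.shape)
```

Setting `heads = 1` in this sketch gives a single-head variant, and skipping the `mhsa` call gives an encoder-only variant, which loosely mirrors the configurations compared in the ablation table.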

Comparison with existing methods

Model/Methodology	Accuracy (%)	F1-score (%)	AUC (%)
Chong et al. [15]: Feature selection + SVM	91.80	90.50	92.30
Jeong et al. [12]: DL with noninvasive biomarkers	93.20	92.70	93.90
Dooley et al. [13]: MASH harmonization with wearables	94.60	93.80	95.10
Proposed MHSA-SAE	97.82	96.67	98.10

Performance of MHSA-SAE on test set

Metric	Value (%)
Accuracy	97.82
Precision	96.45
Recall	96.90
F1-score	96.67
AUC	98.10

Class-wise precision, recall, and F1-score

Activity class	Precision (%)	Recall (%)	F1-score (%)
Standing	98.1	98.4	98.2
Walking	97.3	96.9	97.1
Running	95.0	94.6	94.8
Sitting	96.7	97.0	96.8
Lying down	94.5	95.3	94.9
Climbing stairs	93.1	92.7	92.9
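
The F1-score values in these tables can be checked against the standard definition of F1 as the harmonic mean of precision and recall. The short sketch below recomputes each class-wise F1 from the tabulated precision and recall, and does the same for the test-set figures (precision 96.45, recall 96.90). The function and variable names are illustrative; only the numbers come from the tables.

```python
def f1_score(precision, recall):
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) per activity class, copied from the table.
class_metrics = {
    "Standing": (98.1, 98.4, 98.2),
    "Walking": (97.3, 96.9, 97.1),
    "Running": (95.0, 94.6, 94.8),
    "Sitting": (96.7, 97.0, 96.8),
    "Lying down": (94.5, 95.3, 94.9),
    "Climbing stairs": (93.1, 92.7, 92.9),
}

# Classes whose reported F1 differs from the recomputed value at one decimal place.
mismatches = [
    name
    for name, (p, r, reported) in class_metrics.items()
    if round(f1_score(p, r), 1) != reported
]

overall_f1 = round(f1_score(96.45, 96.90), 2)

print(mismatches)   # []
print(overall_f1)   # 96.67
```

Every reported class-wise F1 matches the recomputed value at the table's precision, and the test-set F1 of 96.67 matches the harmonic mean of the reported overall precision and recall.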

Training vs. validation accuracy over epochs

Epoch	Training accuracy (%)	Validation accuracy (%)
10	84.60	82.30
30	91.20	89.70
60	96.10	94.80
90	97.80	97.20
100	98.00	97.40

DOI: https://doi.org/10.2478/ijssis-2026-0024 | Journal eISSN: 1178-5608

Language: English

Submitted on: May 19, 2025

Published on: May 15, 2026

Published by: Macquarie University, Australia

In partnership with: Paradigm Publishing Services

Publication frequency: 1 issue per year

Keywords: sedentary behavior, health risk prediction, stacked autoencoder (SAE), multi-head self-attention (MHSA)

Related subjects: Engineering; Introductions and overviews; Engineering, other

© 2026 S. Anandanarayanan, S. Thirumaran, published by Macquarie University, Australia
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
