Abstract
Few-shot learning (FSL) aims to recognize novel categories from only a few labeled samples by transferring knowledge learned from base categories. However, the opaque nature of neural networks makes it difficult to discern what knowledge a model has actually learned, and existing FSL methods often lack explainability, limiting their reliable application in high-stakes fields such as medical diagnosis and autonomous driving. To address this, we propose a visually explainable dynamic similarity network (VEDSNet), which balances performance, explainability, and efficiency through a lightweight architecture (approximately 6.8M parameters, built on a ViT-Tiny backbone). The Feature Decomposition Module (FDM) generates fine-grained, semantically meaningful representations via parallel feature learning, offering intuitive visual insight into the model's decisions. The Dynamic Metric Module (DMM) employs a sample-adaptive dual-metric strategy: it combines two metrics to enhance discrimination when samples are scarce, and switches to a single metric for efficiency when samples are sufficient. Experiments on standard datasets demonstrate that VEDSNet achieves high classification accuracy while providing clear visual explanations of its decision-making process, making it suitable for efficient deployment in resource-constrained scenarios.
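The sample-adaptive dual-metric idea behind the DMM can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual formulation: the function names, the shot threshold, the fusion weight `alpha`, and the choice of cosine and Euclidean metrics are all assumptions made for illustration.

```python
import numpy as np

def cosine_sim(q, s):
    # Cosine similarity between a query embedding and a support prototype.
    return float(np.dot(q, s) / (np.linalg.norm(q) * np.linalg.norm(s) + 1e-8))

def euclidean_sim(q, s):
    # Negative squared Euclidean distance, so larger means more similar.
    return float(-np.sum((q - s) ** 2))

def dynamic_similarity(query, prototype, n_shots, shot_threshold=5, alpha=0.5):
    """Sample-adaptive dual-metric sketch (hypothetical): fuse two metrics
    in low-shot regimes, fall back to one metric when data is sufficient."""
    if n_shots < shot_threshold:
        # Scarce samples: combine both metrics for more robust discrimination.
        return alpha * cosine_sim(query, prototype) \
            + (1 - alpha) * euclidean_sim(query, prototype)
    # Sufficient samples: a single metric keeps inference cheap.
    return cosine_sim(query, prototype)
```

The switch makes the efficiency trade-off explicit: the dual metric is only paid for when the support set is small enough that a single metric may discriminate poorly.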