From public to clinical data: External validation of an explainable MedViT model for retinal OCT images

Samuel Gibala; Veronika Kurilova; Milos Oravec; Jarmila Pavlovicova; Jana Stefanickova

doi:10.2478/jee-2025-0050

Abstract

The performance of machine learning models is often evaluated using accuracy metrics, which provide only a partial view of their capabilities, particularly in medical imaging, where explainability is essential. In this study, we propose using MedViT, a hybrid CNN-Transformer model, to classify retinal OCT images. The model achieved 95.95% accuracy on the modified public Retinal OCT C8 dataset and 90% on a private, independent external test dataset, correctly classifying 9 out of 10 cases. Explainability analysis with Grad-CAM showed that the model consistently focused on clinically relevant macular regions. These results indicate that the proposed approach can accurately detect pathological changes and attend to diagnostically important regions in an independent dataset beyond the training data as well, making it a reliable and helpful tool for ophthalmologists in clinical practice.
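
For readers unfamiliar with Grad-CAM, the core computation behind such heatmaps can be sketched in a few lines of NumPy. This is a minimal illustration of the general technique, not the pipeline used in this study: the function name `grad_cam`, the toy feature maps, and the toy gradients are all hypothetical, and in practice the activations and gradients would come from a chosen layer of the trained network.

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Illustrative Grad-CAM heatmap for one image and one target class.

    activations: feature maps of a chosen layer, shape (K, H, W)
    gradients:   gradient of the class score w.r.t. those maps, shape (K, H, W)
    """
    # Channel weights: global-average-pool the gradients over the spatial dims.
    weights = gradients.mean(axis=(1, 2))                      # shape (K,)
    # Weighted sum of the feature maps, then ReLU to keep positive evidence only.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)  # (H, W)
    # Normalise to [0, 1] for display; leave an all-zero map unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy data (assumed values, for illustration only):
# channel 0 responds in the image centre, channel 1 in one corner.
activations = np.zeros((2, 4, 4))
activations[0, 1:3, 1:3] = 1.0
activations[1, 0, 0] = 1.0
# Channel 0 supports the target class, channel 1 counts against it.
gradients = np.stack([np.full((4, 4), 0.5), np.full((4, 4), -0.5)])

heatmap = grad_cam(activations, gradients)
```

In this toy case the heatmap highlights only the central region, because the corner channel has a negative weight and is removed by the ReLU. Overlaying such a map on an OCT B-scan is what allows a reader to check whether the highlighted area coincides with the macula.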

