Abstract
Accurate segmentation of medical images is crucial for diagnosis and treatment planning. Convolutional Neural Networks (CNNs) excel at capturing local features but struggle to model long-range dependencies. Vision Transformers (ViTs), by contrast, model global context well but demand substantial computation and labeled data. To address these challenges, we propose PSwinUNet, a hybrid CNN-Transformer framework built on a partially supervised learning scheme. By integrating Swin Transformer blocks into a U-shaped architecture, PSwinUNet improves both global semantic learning and up-sampling, and it employs a polarized self-attention mechanism in the skip connections to prevent the loss of spatial information during down-sampling. Evaluated on the BUSI, DRIVE, and CVC-ClinicDB datasets, PSwinUNet outperforms current state-of-the-art approaches. For instance, it achieved Dice Similarity Coefficient (DSC) scores of 0.781, 0.896, and 0.960 on the BUSI dataset with 1/8, 1/2, and fully labeled data, respectively, substantially surpassing the classical UNet and UNet++ models.
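To make the skip-connection idea concrete, the sketch below shows one simplified way a polarized-style self-attention block (sequential channel-only and spatial-only branches) could gate an encoder feature map before it is fused with decoder features. This is a minimal illustration under assumed layer names and sizes, not the paper's implementation.

```python
import torch
import torch.nn as nn

class PolarizedSelfAttention(nn.Module):
    """Simplified polarized self-attention (illustrative sketch):
    a channel-only branch re-weights channels, then a spatial-only
    branch re-weights positions. `channels` must be even."""
    def __init__(self, channels):
        super().__init__()
        # Channel-only branch: collapse spatial dims into a channel weight.
        self.ch_wq = nn.Conv2d(channels, 1, kernel_size=1)
        self.ch_wv = nn.Conv2d(channels, channels // 2, kernel_size=1)
        self.ch_wz = nn.Conv2d(channels // 2, channels, kernel_size=1)
        # Spatial-only branch: collapse channels into a spatial weight map.
        self.sp_wq = nn.Conv2d(channels, channels // 2, kernel_size=1)
        self.sp_wv = nn.Conv2d(channels, channels // 2, kernel_size=1)
        self.softmax = nn.Softmax(dim=-1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        b, c, h, w = x.shape
        # --- channel-only branch ---
        q = self.softmax(self.ch_wq(x).view(b, 1, h * w))         # (b, 1, hw)
        v = self.ch_wv(x).view(b, c // 2, h * w)                  # (b, c/2, hw)
        z = torch.bmm(v, q.transpose(1, 2)).view(b, c // 2, 1, 1)
        ch_out = self.sigmoid(self.ch_wz(z)) * x                  # channel gating
        # --- spatial-only branch ---
        q = self.sp_wq(ch_out).mean(dim=(2, 3), keepdim=True)     # global query (b, c/2, 1, 1)
        v = self.sp_wv(ch_out).view(b, c // 2, h * w)             # (b, c/2, hw)
        attn = self.softmax(torch.bmm(q.view(b, 1, c // 2), v))   # (b, 1, hw)
        return attn.view(b, 1, h, w) * ch_out                     # spatial gating

# Hypothetical usage: refine an encoder skip feature before it is
# concatenated with the decoder's upsampled features.
skip = torch.randn(2, 64, 56, 56)
refined = PolarizedSelfAttention(64)(skip)
print(refined.shape)  # torch.Size([2, 64, 56, 56])
```

Because both branches produce multiplicative weights over the original feature map, the block preserves the skip feature's resolution while emphasizing the channels and positions most relevant to the segmentation target.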