Lightweight inception-UNet with attention mechanisms for semantic segmentation

Open Access | Apr 2026

References

  1. S. Hao, Y. Zhou, Y. Guo, A brief survey on semantic segmentation with deep learning, Neurocomputing 406 (2020) 302–321.
  2. A. Garcia-Garcia, S. Orts-Escolano, S. Oprea, V. Villena-Martinez, P. Martinez-Gonzalez, J. Garcia-Rodriguez, A survey on deep learning techniques for image and video semantic segmentation, Applied Soft Computing 70 (2018) 41–65.
  3. X. Yuan, J. Shi, L. Gu, A review of deep learning methods for semantic segmentation of remote sensing imagery, Expert Systems with Applications 169 (2021) 114417.
  4. G. Lampropoulos, E. Keramopoulos, K. Diamantaras, Enhancing the functionality of augmented reality using deep learning, semantic web and knowledge graphs: A review, Visual Informatics 4 (1) (2020) 32–42.
  5. H. T. Nguyen, N. N. Truong, L. T. T. Pham, N. H. Pham, An approach using skeleton-based representations and neural networks for yoga pose recognition, Applied Computer Systems 30 (1) (2025) 75–84.
  6. K. Thyagharajan, G. Kalaiarasi, A review on near-duplicate detection of images using computer vision techniques, Archives of Computational Methods in Engineering 28 (2021) 897–916.
  7. R. Dhir, M. Ashok, S. Gite, et al., An overview of advances in image colorization using computer vision and deep learning techniques, Rev. Comput. Eng. Res 7 (2) (2020) 86–95.
  8. N. T. K. Son, N. H. Quynh, B. T. Minh, Refining graduation classification accuracy with synergistic deep learning models, Cybernetics and Information Technologies 25 (2) (2025).
  9. M. Grupac, G. Lăzăroiu, Image processing computational algorithms, sensory data mining techniques, and predictive customer analytics in the metaverse economy, Review of Contemporary Philosophy 21 (2022) 205–222.
  10. Y. Zhu, C. Yao, X. Bai, Scene text detection and recognition: Recent advances and future trends, Frontiers of Computer Science 10 (2016) 19–36.
  11. H. Greenspan, B. Van Ginneken, R. M. Summers, Guest editorial deep learning in medical imaging: Overview and future promise of an exciting new technique, IEEE transactions on medical imaging 35 (5) (2016) 1153–1159.
  12. V. Naosekpam, N. Sahu, Text detection, recognition, and script identification in natural scene images: A review, International Journal of Multimedia Information Retrieval 11 (3) (2022) 291–314.
  13. O. Ronneberger, P. Fischer, T. Brox, U-net: Convolutional networks for biomedical image segmentation, in: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5–9, 2015, Proceedings, Part III 18, Springer, 2015, pp. 234–241.
  14. A. Paszke, A. Chaurasia, S. Kim, E. Culurciello, Enet: A deep neural network architecture for real-time semantic segmentation, arXiv preprint arXiv:1606.02147 (2016).
  15. V. Badrinarayanan, A. Kendall, R. Cipolla, Segnet: A deep convolutional encoder-decoder architecture for image segmentation, IEEE transactions on pattern analysis and machine intelligence 39 (12) (2017) 2481–2495.
  16. C. Peng, T. Tian, C. Chen, X. Guo, J. Ma, Bilateral attention decoder: A lightweight decoder for real-time semantic segmentation, Neural Networks 137 (2021) 188–199.
  17. A. Abedalla, M. Abdullah, M. Al-Ayyoub, E. Benkhelifa, The 2st-unet for pneumothorax segmentation in chest x-rays using resnet34 as a backbone for u-net, arXiv preprint arXiv:2009.02805 (2020).
  18. Autorickshaw detection challenge, https://cvit.iiit.ac.in/autorickshaw_detection/, (Accessed on 01/10/2024).
  19. Autonue challenge 2019, https://cvit.iiit.ac.in/autonue2019/challenge/overview.php, (Accessed on 12/21/2023).
  20. CT liver, https://www.kaggle.com/datasets/zxcv2022/digital-medical-images-for-download-resource/, (Accessed on 12/21/2023).
  21. W. Zhang, Z. Liu, L. Zhou, H. Leung, A. B. Chan, Martial arts, dancing and sports dataset: A challenging stereo and multi-view dataset for 3d human pose estimation, Image and Vision Computing 61 (2017) 22–39.
  22. Visual geometry group - university of oxford, https://www.robots.ox.ac.uk/~vgg/data/pets/, (Accessed on 12/21/2023).
  23. F. Yu, V. Koltun, Multi-scale context aggregation by dilated convolutions, arXiv preprint arXiv:1511.07122 (2015).
  24. W. Liu, A. Rabinovich, A. C. Berg, Parsenet: Looking wider to see better, arXiv preprint arXiv:1506.04579 (2015).
  25. G. Lin, C. Shen, A. Van Den Hengel, I. Reid, Efficient piecewise training of deep structured models for semantic segmentation, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 3194–3203.
  26. H. Zhang, K. Dana, J. Shi, Z. Zhang, X. Wang, A. Tyagi, A. Agrawal, Context encoding for semantic segmentation, in: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 7151–7160. doi:10.1109/CVPR.2018.00747.
  27. Y. Zhuang, F. Yang, L. Tao, C. Ma, Z. Zhang, Y. Li, H. Jia, X. Xie, W. Gao, Dense relation network: Learning consistent and context-aware representation for semantic image segmentation, in: 2018 25th IEEE international conference on image processing (ICIP), IEEE, 2018, pp. 3698–3702.
  28. H. Zhang, H. Zhang, C. Wang, J. Xie, Co-occurrent features in semantic segmentation, in: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 548–557.
  29. H. Ding, X. Jiang, B. Shuai, A. Q. Liu, G. Wang, Semantic correlation promoted shape-variant context for segmentation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 8885–8894.
  30. J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 3431–3440.
  31. Z. Zhou, M. M. Rahman Siddiquee, N. Tajbakhsh, J. Liang, Unet++: A nested u-net architecture for medical image segmentation, in: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support: 4th International Workshop, DLMIA 2018, and 8th International Workshop, ML-CDS 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 20, 2018, Proceedings 4, Springer, 2018, pp. 3–11.
  32. Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, O. Ronneberger, 3d u-net: learning dense volumetric segmentation from sparse annotation, in: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2016: 19th International Conference, Athens, Greece, October 17–21, 2016, Proceedings, Part II 19, Springer, 2016, pp. 424–432.
  33. X. Li, H. Chen, X. Qi, Q. Dou, C.-W. Fu, P.-A. Heng, H-denseunet: hybrid densely connected unet for liver and tumor segmentation from ct volumes, IEEE transactions on medical imaging 37 (12) (2018) 2663–2674.
  34. S. Shah, P. Ghosh, L. S. Davis, T. Goldstein, Stacked u-nets: a no-frills approach to natural image segmentation, arXiv preprint arXiv:1804.10343 (2018).
  35. T. M. Quan, D. G. C. Hildebrand, W.-K. Jeong, Fusionnet: A deep fully residual convolutional neural network for image segmentation in connectomics, Frontiers in Computer Science 3 (2021) 613981.
  36. B. Hariharan, P. Arbeláez, R. Girshick, J. Malik, Hypercolumns for object segmentation and fine-grained localization, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 447–456.
  37. G. Lin, A. Milan, C. Shen, I. Reid, Refinenet: Multi-path refinement networks for high-resolution semantic segmentation, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1925–1934.
  38. V. Nekrasov, C. Shen, I. Reid, Light-weight refinenet for real-time semantic segmentation, arXiv preprint arXiv:1810.03272 (2018).
  39. T. Pohlen, A. Hermans, M. Mathias, B. Leibe, Full-resolution residual networks for semantic segmentation in street scenes, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 4151–4160.
  40. H. Sak, A. Senior, F. Beaufays, Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition, arXiv preprint arXiv:1402.1128 (2014).
  41. R. Messina, J. Louradour, Segmentation-free handwritten chinese text recognition with LSTM-RNN, in: 2015 13th International conference on document analysis and recognition (ICDAR), IEEE, 2015, pp. 171–175.
  42. P. Pinheiro, R. Collobert, Recurrent convolutional neural networks for scene labeling, in: International conference on machine learning, PMLR, 2014, pp. 82–90.
  43. R. P. Poudel, P. Lamata, G. Montana, Recurrent fully convolutional neural networks for multi-slice mri cardiac segmentation, in: Reconstruction, Segmentation, and Analysis of Medical Images: First International Workshops, RAMBO 2016 and HVSMR 2016, Held in Conjunction with MICCAI 2016, Athens, Greece, October 17, 2016, Revised Selected Papers 1, Springer, 2017, pp. 83–94.
  44. W. Byeon, T. M. Breuel, F. Raue, M. Liwicki, Scene labeling with LSTM recurrent neural networks, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2015, pp. 3547–3555.
  45. F. Visin, M. Ciccone, A. Romero, K. Kastner, K. Cho, Y. Bengio, M. Matteucci, A. Courville, Reseg: A recurrent neural network-based model for semantic segmentation, in: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, 2016, pp. 41–48.
  46. B. Shuai, Z. Zuo, B. Wang, G. Wang, Dag-recurrent neural networks for scene labeling, in: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 3620–3629.
  47. H. Fan, H. Ling, Dense recurrent neural networks for scene labeling, arXiv preprint arXiv:1801.06831 (2018).
  48. Q. Zhao, J. Liu, Y. Li, H. Zhang, Semantic segmentation with attention mechanism for remote sensing images, IEEE Transactions on Geoscience and Remote Sensing 60 (2021) 1–13.
  49. H. Noh, S. Hong, B. Han, Learning deconvolution network for semantic segmentation, in: Proceedings of the IEEE international conference on computer vision, 2015, pp. 1520–1528.
  50. V. Badrinarayanan, A. Kendall, R. Cipolla, Segnet: A deep convolutional encoder-decoder architecture for image segmentation, arXiv preprint arXiv:1511.00561 (2015).
  51. D. Fourure, R. Emonet, E. Fromont, D. Muselet, A. Tremeau, C. Wolf, Residual conv-deconv grid network for semantic segmentation, arXiv preprint arXiv:1707.07958 (2017).
  52. A. Kendall, V. Badrinarayanan, R. Cipolla, Bayesian segnet: Model uncertainty in deep convolutional encoder-decoder architectures for scene understanding, arXiv preprint arXiv:1511.02680 (2015).
  53. J. Fu, J. Liu, Y. Wang, J. Zhou, C. Wang, H. Lu, Stacked deconvolutional network for semantic segmentation, IEEE Transactions on Image Processing (2019).
  54. S. M. Sam, K. Kamardin, N. N. A. Sjarif, N. Mohamed, et al., Offline signature verification using deep learning convolutional neural network (CNN) architectures googlenet inception-v1 and inception-v3, Procedia Computer Science 161 (2019) 475–483.
  55. O. Oktay, J. Schlemper, L. L. Folgoc, M. Lee, M. Heinrich, K. Misawa, K. Mori, S. McDonagh, N. Y. Hammerla, B. Kainz, et al., Attention u-net: Learning where to look for the pancreas, arXiv preprint arXiv:1804.03999 (2018).
  56. J. Lee, J. Choi, J. Mok, S. Yoon, Reducing information bottleneck for weakly supervised semantic segmentation, Advances in Neural Information Processing Systems 34 (2021) 27408–27421.
  57. Y. Kim, Y. Lee, M. Jeon, Imbalanced image classification with complement cross entropy, Pattern Recognition Letters 151 (2021) 33–40.
  58. Z. Zhang, M. Sabuncu, Generalized cross entropy loss for training deep neural networks with noisy labels, Advances in neural information processing systems 31 (2018).
  59. C. K. Dewa, et al., Suitable CNN weight initialization and activation function for Javanese vowels classification, Procedia computer science 144 (2018) 124–132.
  60. U. M. Khaire, R. Dhanalakshmi, High-dimensional microarray dataset classification using an improved adam optimizer (iadam), Journal of Ambient Intelligence and Humanized Computing 11 (11) (2020) 5187–5204.
  61. R. Meyes, M. Lu, C. W. de Puiseau, T. Meisen, Ablation studies in artificial neural networks, arXiv preprint arXiv:1901.08644 (2019).
  62. R. F. Woolson, Wilcoxon signed-rank test, Wiley encyclopedia of clinical trials (2007) 1–3.
Language: English
Submitted on: Jul 11, 2025
Published on: Apr 10, 2026
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Twinkle Tiwari, Mukesh Saraswat, published by Macquarie University, Australia
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.