References
- K. Min, G.-H. Lee, S.-W. Lee, Attentional feature pyramid network for small object detection, Neural Networks, 155, 2022, 439–450.
- C. Zhang, Q. Gao, R. Shi, M. Yue, LDHDNet: a lightweight network with double branch head for feature enhancement of UAV targets in complex scenes, International Journal of Intelligent Systems, 2024, 7259029.
- J. Wei, S. Su, Z. Zhao, et al., Infrared pedestrian detection using improved UNet and YOLO through sharing visible light domain information, Measurement, 221, 2023, 113442.
- H. Shang, L. Sun, W. Qin, Pedestrian detection at night based on infrared camera and millimeter wave radar fusion, Journal of Sensor Technology, 34, 2021, 1137–1145.
- Y. Xue, Z. Ju, Y. Li, W. Zhang, MAF-YOLO: multi-modal attention fusion based YOLO for pedestrian detection, Infrared Physics & Technology, 118, 2021, 103906.
- L.-J. Liu, Y. Zhang, H. R. Karimi, Resilient machine learning for steel surface defect detection based on lightweight convolution, International Journal of Advanced Manufacturing Technology, 134, 2024, 4639–4650.
- L.-J. Liu, S.-Q. Sun, H. R. Karimi, A real-time surface defect detection model based on adaptive feature information selection and fusion, Information Fusion, 129, 2026.
- L. Zhang, L. Zhong, et al., Knowledge-guided multi-task attention network for survival risk prediction using multi-center computed tomography images, Neural Networks, 152, 2022, 394–406.
- J. Hu, L. Shen, S. Albanie, G. Sun, E. Wu, Squeeze-and-excitation networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 2017, 2011–2023.
- S. Woo, J. Park, J.-Y. Lee, I. S. Kweon, CBAM: convolutional block attention module, Proceedings of the European Conference on Computer Vision (ECCV), 2018, 3–19.
- N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, S. Zagoruyko, End-to-end object detection with transformers, European Conference on Computer Vision, 2020, 213–229.
- Y. Cao, J. Xu, S. Lin, F. Wei, H. Hu, GCNet: non-local networks meet squeeze-excitation networks and beyond, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), 2019, 1971–1980.
- X. Ma, K. Hu, X. Sun, S. Chen, Adaptive attention module for image recognition systems in autonomous driving, International Journal of Intelligent Systems, 2024, 3934270.
- L.-J. Liu, Y. Zhang, H. R. Karimi, Defect detection of printed circuit board surface based on an improved YOLOv8 with FasterNet backbone algorithms, Signal, Image and Video Processing, 19, 2025, 89.
- P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, Y. LeCun, OverFeat: integrated recognition, localization and detection using convolutional networks, CoRR, 2013.
- W. Liu, D. Anguelov, D. Erhan, et al., SSD: single shot MultiBox detector, European Conference on Computer Vision, 2016.
- J. Redmon, S. K. Divvala, R. B. Girshick, A. Farhadi, You only look once: unified, real-time object detection, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, 779–788.
- R. B. Girshick, J. Donahue, T. Darrell, J. Malik, Rich feature hierarchies for accurate object detection and semantic segmentation, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2013, 580–587.
- R. Girshick, Fast R-CNN, 2015 IEEE International Conference on Computer Vision (ICCV), 2015, 1440–1448.
- K. He, X. Zhang, S. Ren, J. Sun, Spatial pyramid pooling in deep convolutional networks for visual recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015.
- K. He, G. Gkioxari, P. Dollár, R. B. Girshick, Mask R-CNN, IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 2017, 386–397.
- S. Ren, K. He, R. B. Girshick, J. Sun, Faster R-CNN: towards real-time object detection with region proposal networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 2015, 1137–1149.
- J. Redmon, A. Farhadi, YOLO9000: better, faster, stronger, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, 6517–6525.
- J. Redmon, YOLOv3: an incremental improvement, ArXiv, 2018.
- A. Bochkovskiy, C.-Y. Wang, H.-Y. M. Liao, YOLOv4: optimal speed and accuracy of object detection, ArXiv, 2020.
- G. Jocher, YOLOv5 by Ultralytics, Available at https://github.com/ultralytics/yolov5, 2022.
- C. Li, L. Li, H. Jiang, K. Weng, Y. Geng, L. Li, Z. Ke, Q. Li, M. Cheng, W. Nie, et al., YOLOv6: a single-stage object detection framework for industrial applications, arXiv preprint arXiv:2209.02976, 2022.
- C.-Y. Wang, A. Bochkovskiy, H.-Y. M. Liao, YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, 7464–7475.
- W. Zhang, Z. Hong, L. Xiong, Z. Zeng, Z. Cai, K. Tan, Sinextnet: a new small object detection model for aerial images based on PP-YOLOE, Journal of Artificial Intelligence and Soft Computing Research, 14, 2024.
- X. Ji, J. Chang, Y. Ji, Adaptive separation fusion: a novel downsampling approach in CNNs, Journal of Artificial Intelligence and Soft Computing Research, 15, 2025.
- T.-Y. Lin, P. Dollár, R. B. Girshick, K. He, S. Belongie, et al., Feature pyramid networks for object detection, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, 936–944.
- S. Liu, L. Qi, H. Qin, J. Shi, J. Jia, Path aggregation network for instance segmentation, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, 8759–8768.
- D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly learning to align and translate, CoRR, 2014.
- K. Xu, J. Ba, R. Kiros, K. Cho, A. C. Courville, R. S. Zemel, Y. Bengio, Show, attend and tell: neural image caption generation with visual attention, CoRR, 2015.
- K. Gregor, I. Danihelka, A. Graves, D. J. Rezende, et al., DRAW: a recurrent neural network for image generation, ArXiv, 2015.
- M. Jaderberg, K. Simonyan, A. Zisserman, et al., Spatial transformer networks, Advances in Neural Information Processing Systems, 28, 2015.
- J. Xu, Y. Cai, X. Wu, X. Lei, Q. Huang, H. F. Leung, Q. Li, Incorporating context-relevant concepts into convolutional neural networks for short text classification, Neurocomputing, 386, 2020, 42–53.
- Y. Cai, Q. Huang, Z. Lin, J. Xu, et al., Recurrent neural network with pooling operation and attention mechanism for sentiment analysis: a multi-task learning approach, Knowledge-Based Systems, 203, 2020, 1–12.
- T. Hussain, W.-C. Wang, M. Gogate, K. Dashtipour, Y. Tsao, X. Lu, A. Ahsan, A. Hussain, A novel temporal attentive-pooling based convolutional recurrent architecture for acoustic signal enhancement, IEEE Transactions on Artificial Intelligence, 3, 2022, 833–842.
- M. Nawaz, T. Nazir, A. Javed, M. F. Masood, J. Rashid, J. Kim, A. Hussain, A robust deep learning approach for tomato plant leaf disease localization and classification, Scientific Reports, 12, 2022.
- K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, 770–778.
- A. Veit, M. J. Wilber, S. Belongie, Residual networks behave like ensembles of relatively shallow networks, Advances in Neural Information Processing Systems, 29, 2016.
- C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, et al., Going deeper with convolutions, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, 1–9.
- S. Ioffe, C. Szegedy, Batch normalization: accelerating deep network training by reducing internal covariate shift, ArXiv, 2015.
- C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, et al., Rethinking the inception architecture for computer vision, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, 2818–2826.
- C. Szegedy, S. Ioffe, V. Vanhoucke, A. A. Alemi, Inception-v4, Inception-ResNet and the impact of residual connections on learning, ArXiv, 2016.
- X. Ding, X. Zhang, N. Ma, J. Han, G. Ding, J. Sun, RepVGG: making VGG-style ConvNets great again, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, 13728–13737.
- G. Song, Y. Liu, X. Wang, Revisiting the sibling head in object detector, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, 11563–11572.
- Z. Ge, S. Liu, F. Wang, Z. Li, J. Sun, YOLOX: exceeding YOLO series in 2021, ArXiv, 2021.
- Z. Gevorgyan, SIoU loss: more powerful learning for bounding box regression, ArXiv, 2022.
- S. Hwang, J. Park, N. Kim, Y. Choi, I. S. Kweon, Multispectral pedestrian detection: benchmark dataset and baseline, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, 1037–1045.
- Y. Cao, C. Li, Y. Peng, Night pedestrian detection algorithm based on improved YOLOv7, Changjiang Information & Communications, 35, 2023, 57–60.
- Z. He, G. Chen, J. Chen, Y. Zhang, et al., Multi-scale feature fusion lightweight real-time infrared pedestrian detection at night, Chinese Journal of Lasers, 49, 2023, 1709002.