Abstract
This paper proposes CDH-YOLO, an efficient, real-time pedestrian detection model for nighttime RGB images. Built on YOLOv5, CDH-YOLO applies structural reparameterization to optimize the backbone network and integrates a convolutional block attention module (CBAM) to enhance feature representation. Transposed convolution replaces nearest-neighbor interpolation for upsampling to better preserve semantic information. A lightweight decoupled head addresses the spatial misalignment between the classification and regression tasks, while the SIoU loss improves training convergence and localization accuracy. Experiments on the KAIST dataset show that CDH-YOLO achieves higher accuracy than existing methods for nighttime pedestrian detection while maintaining real-time performance.
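
To illustrate the upsampling change, the following minimal sketch compares nearest-neighbor interpolation with a stride-2 transposed convolution on a single-channel feature map. The 2x2 kernel, the stride, and the weight values are illustrative assumptions and are not taken from the CDH-YOLO configuration. With an all-ones kernel the transposed convolution reproduces nearest-neighbor upsampling exactly, which suggests that a learned kernel can only add flexibility over the fixed interpolation.

```python
def nearest_upsample_2x(x):
    """Nearest-neighbor 2x upsampling of a 2D feature map (list of lists)."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def transposed_conv_2x(x, kernel):
    """Stride-2 transposed convolution with a 2x2 kernel, single channel, no padding."""
    h, w = len(x), len(x[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            # Each input value scatters a scaled copy of the kernel into the output.
            for di in range(2):
                for dj in range(2):
                    out[2 * i + di][2 * j + dj] += x[i][j] * kernel[di][dj]
    return out


feature = [[1.0, 2.0], [3.0, 4.0]]
ones_kernel = [[1.0, 1.0], [1.0, 1.0]]        # fixed weights: equivalent to nearest neighbor
learned_kernel = [[1.0, 0.5], [0.5, 0.25]]    # hypothetical learned weights (assumption)

matches_nearest = transposed_conv_2x(feature, ones_kernel) == nearest_upsample_2x(feature)
learned_output = transposed_conv_2x(feature, learned_kernel)
```

In a full network the kernel weights would be trained jointly with the rest of the model, and the operation would run across many channels rather than the single channel shown here.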