3D object localization has emerged as one of the pivotal challenges in machine vision. In this paper, we propose a novel 3D object localization method that combines deep learning techniques for object detection, image post-processing, and pose estimation. Our approach incorporates 3D calibration methods tailored to cost-effective industrial robotics systems and requires only a single 2D image as input. Initially, object detection is performed using the You Only Look Once (YOLO) model, followed by an R-CNN model that segments the object into two distinct parts, i.e., the top face and the remaining parts. The center of the top face then serves as the initial positioning reference, which is refined through a novel calibration algorithm. Our experimental results indicate a significant enhancement in localization accuracy, demonstrating the method's efficacy in consistently reducing localization errors across various testing scenarios. We have also made the code and datasets publicly available at https://github.com/NguyenCanhThanh/MonoCalibNet.
© 2025 Thanh Nguyen Canh, Du Trinh Ngoc, Xiem HoangVan, published by Łukasiewicz Research Network – Industrial Research Institute for Automation and Measurements PIAP
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
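To illustrate the pipeline summarized in the abstract, the following is a minimal sketch in Python. It assumes hypothetical weight files ("yolo_box.pt", "maskrcnn_topface.pth"), an assumed camera intrinsic matrix, and a generic pinhole back-projection in place of the paper's calibration algorithm; it is not the authors' released implementation (see the repository above for that).

```python
# Minimal sketch of the described pipeline: YOLO detection -> top-face
# segmentation -> top-face centroid -> 3D reference point.
# Weight file names and the back-projection step are illustrative assumptions.
import cv2
import numpy as np
import torch
import torchvision
from ultralytics import YOLO


def pixel_to_3d(cx, cy, K, top_face_depth):
    """Back-project a pixel onto the plane at a known depth from the camera,
    assuming a calibrated pinhole model. This is a generic stand-in, not the
    paper's calibration algorithm."""
    fx, fy = K[0, 0], K[1, 1]
    u0, v0 = K[0, 2], K[1, 2]
    z = top_face_depth
    x = (cx - u0) * z / fx
    y = (cy - v0) * z / fy
    return np.array([x, y, z])


def localize(image_path, K, top_face_depth):
    image = cv2.imread(image_path)

    # Step 1: detect the object with YOLO (hypothetical custom weights).
    detector = YOLO("yolo_box.pt")
    x1, y1, x2, y2 = detector(image)[0].boxes.xyxy[0].int().tolist()
    crop = image[y1:y2, x1:x2]

    # Step 2: segment the crop into top face vs. remaining parts with a
    # Mask R-CNN head (hypothetical fine-tuned weights; 3 classes =
    # background, top face, remaining parts).
    segmenter = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=3)
    segmenter.load_state_dict(torch.load("maskrcnn_topface.pth"))
    segmenter.eval()
    tensor = torch.from_numpy(crop[:, :, ::-1].copy()).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        pred = segmenter([tensor])[0]
    # Assume the highest-scoring instance is the top face.
    top_face_mask = (pred["masks"][0, 0] > 0.5).numpy()

    # Step 3: use the centroid of the top-face mask (in full-image
    # coordinates) as the initial positioning reference.
    ys, xs = np.nonzero(top_face_mask)
    cx, cy = x1 + xs.mean(), y1 + ys.mean()

    # Step 4: map the 2D reference to a 3D position. The paper refines this
    # with its calibration algorithm; here we only back-project with a
    # pinhole model as a placeholder.
    return pixel_to_3d(cx, cy, K, top_face_depth)
```

A typical call would pass the robot cell's camera intrinsics and the known depth of the object's top face, e.g. `localize("scene.png", K, 0.85)`; in practice the refinement in Step 4 is where the proposed calibration method replaces the naive back-projection.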