
Single-image indoor localization using cross-domain learning from BIM models

Open Access | May 2026

References

  1. Acharya, D. (2020). Visual indoor localisation using a 3D building model. PhD thesis, The University of Melbourne.
  2. Acharya, D. and Khoshelham, K. (2023). Reverse domain adaptation for indoor camera pose regression. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, X-1/W1-2023:453–460, doi:10.5194/isprs-annals-X-1-W1-2023-453-2023.
  3. Acharya, D., Khoshelham, K., and Winter, S. (2019). BIMPoseNet: Indoor camera localisation using a 3D indoor model and deep learning from synthetic images. ISPRS Journal of Photogrammetry and Remote Sensing, 150:245–258, doi:10.1016/j.isprsjprs.2019.02.020.
  4. Acharya, D., Tatli, C. J., and Khoshelham, K. (2023). Synthetic-real image domain adaptation for indoor camera pose regression using a 3D model. ISPRS Journal of Photogrammetry and Remote Sensing, 202:405–421, doi:10.1016/j.isprsjprs.2023.06.013.
  5. Acharya, D., Tennakoon, R., Muthu, S., Khoshelham, K., Hoseinnezhad, R., and Bab-Hadiashar, A. (2022). Single-image localisation using 3D models: Combining hierarchical edge maps and semantic segmentation for domain adaptation. Automation in Construction, 136:104152, doi:10.1016/j.autcon.2022.104152.
  6. Agarwal, S., Snavely, N., Simon, I., Seitz, S. M., and Szeliski, R. (2009). Building Rome in a day. In 2009 IEEE 12th International Conference on Computer Vision, pages 72–79. doi:10.1109/ICCV.2009.5459148.
  7. Bach, T. B., Dinh, T. T., and Lee, J.-H. (2022). FeatLoc: Absolute pose regressor for indoor 2D sparse features with simplistic view synthesizing. ISPRS Journal of Photogrammetry and Remote Sensing, 189:50–62, doi:10.1016/j.isprsjprs.2022.04.021.
  8. Bay, H., Tuytelaars, T., and Van Gool, L. (2006). SURF: Speeded Up Robust Features. In Leonardis, A., Bischof, H., and Pinz, A., editors, Computer Vision – ECCV 2006, pages 404–417, Berlin, Heidelberg. Springer Berlin Heidelberg.
  9. Blanton, H. (2021). Revisiting Absolute Pose Regression. PhD thesis, University of Kentucky.
  10. Bousmalis, K., Silberman, N., Dohan, D., Erhan, D., and Krishnan, D. (2016). Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 95–104.
  11. Clark, R., Wang, S., Markham, A., Trigoni, N., and Wen, H. (2017). VidLoc: 6-DoF Video-Clip Relocalization. CoRR, abs/1702.06521.
  12. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. (2009). ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. doi:10.1109/CVPR.2009.5206848.
  13. Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., and Houlsby, N. (2020). An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale.
  14. Dozat, T. (2016). Incorporating Nesterov Momentum into Adam. In Proceedings of the 4th International Conference on Learning Representations (ICLR) Workshop.
  15. Furukawa, Y. and Ponce, J. (2010). Accurate, Dense, and Robust Multiview Stereopsis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(8):1362–1376, doi:10.1109/TPAMI.2009.161.
  16. Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014). Generative Adversarial Networks.
  17. He, K., Zhang, X., Ren, S., and Sun, J. (2015). Deep Residual Learning for Image Recognition.
  18. Kendall, A. and Cipolla, R. (2016). Modelling uncertainty in deep learning for camera relocalization. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 4762–4769. doi:10.1109/ICRA.2016.7487679.
  19. Kendall, A. and Cipolla, R. (2017). Geometric loss functions for camera pose regression with deep learning. CoRR, abs/1704.00390.
  20. Kendall, A., Grimes, M., and Cipolla, R. (2015). Convolutional networks for real-time 6-DOF camera relocalization. CoRR, abs/1505.07427.
  21. Li, M., Qin, J., Li, D., Chen, R., Liao, X., and Guo, B. (2021). VNLSTMPoseNet: A novel deep ConvNet for real-time 6-DOF camera relocalization in urban streets. Geo-spatial Information Science, 24(3):422–437, doi:10.1080/10095020.2021.1960779.
  22. Lowe, D. G. (2004). Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60:91–110, doi:10.1023/B:VISI.0000029664.99615.94.
  23. Nurutdinova, I. and Fitzgibbon, A. (2015). Towards Pointless Structure from Motion: 3D Reconstruction and Camera Parameters from General 3D Curves. In 2015 IEEE International Conference on Computer Vision (ICCV), pages 2363–2371. doi:10.1109/ICCV.2015.272.
  24. Peng, X., Sun, B., Ali, K., and Saenko, K. (2015). Learning Deep Object Detectors from 3D Models.
  25. Sattler, T., Zhou, Q., Pollefeys, M., and Leal-Taixé, L. (2019). Understanding the Limitations of CNN-based Absolute Camera Pose Regression. CoRR, abs/1903.07504.
  26. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015). Going deeper with convolutions. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1–9. doi:10.1109/CVPR.2015.7298594.
  27. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. (2017). Attention Is All You Need.
  28. Walch, F., Hazirbas, C., Leal-Taixé, L., Sattler, T., Hilsenbeck, S., and Cremers, D. (2016). Image-based Localization with Spatial LSTMs. CoRR, abs/1611.07890.
  29. Yao, D., Zhu, H., Ren, B., and Zhuang, X. (2024). Improving single image localization through domain adaptation and large kernel attention with synthetic data. Engineering Applications of Artificial Intelligence, 137:108951, doi:10.1016/j.engappai.2024.108951.
  30. Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., and Torralba, A. (2018). Places: A 10 Million Image Database for Scene Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(6):1452–1464, doi:10.1109/TPAMI.2017.2723009.
  31. Zhu, J.-Y., Park, T., Isola, P., and Efros, A. A. (2020). Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks.
DOI: https://doi.org/10.2478/rgg-2026-0004 | Journal eISSN: 2391-8152 | Journal ISSN: 0867-3179
Language: English
Page range: 50–58
Submitted on: Nov 27, 2025
Accepted on: Apr 4, 2026
Published on: May 6, 2026
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2026 Piotr Ryszko, Dorota Włodarczyk, Małgorzata Jarząbek-Rychard, published by Warsaw University of Technology
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.