Annotation-free Generation of Training Data Using Mixed Domains for Segmentation of 3D LiDAR Point Clouds

Open Access | Aug 2025

Abstract

Semantic segmentation is important for robots navigating with 3D LiDARs, but generating training datasets requires tedious manual effort. In this paper, we introduce a set of strategies for efficiently generating large datasets by combining real and synthetic data samples. More specifically, the method populates recorded empty scenes with synthetically generated, navigation-relevant obstacles, thus combining two domains: real-life and synthetic. Our approach requires no manual annotation, no detailed knowledge of the actual data feature distribution, and no real-life data of the objects of interest. We validate the proposed method in an underground parking scenario and compare it with available open-source datasets. The experiments show that it outperforms off-the-shelf datasets with similar data characteristics, but they also highlight the difficulty of matching the performance achieved with manually annotated datasets. We also show that combining generated and annotated data noticeably improves performance, especially when objects of interest occur rarely. Our solution is suitable for direct application in robotic systems.
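
The core idea of populating a recorded empty scene with synthetic obstacles can be illustrated with a minimal sketch. This is not the authors' implementation: the flat-floor stand-in for a recorded scan, the box-shaped obstacle, the label ids, and the function names `synth_box` and `populate_scene` are all assumptions made for illustration, and the sketch ignores sensor simulation and occlusion. It only shows why no manual annotation is needed: every point's label is known from the domain it came from.

```python
import random

# Hypothetical label ids, chosen for this sketch only.
BACKGROUND, OBSTACLE = 0, 1


def synth_box(center, size, n, rng):
    """Sample n points uniformly inside an axis-aligned box.

    A crude stand-in for a synthetically generated obstacle.
    """
    cx, cy, cz = center
    sx, sy, sz = size
    return [
        (
            cx + (rng.random() - 0.5) * sx,
            cy + (rng.random() - 0.5) * sy,
            cz + (rng.random() - 0.5) * sz,
        )
        for _ in range(n)
    ]


def populate_scene(empty_scan, obstacle_points):
    """Merge an empty-scene scan with synthetic obstacle points.

    Labels follow directly from the source of each point, so the
    merged sample is annotated without any manual work.
    """
    points = list(empty_scan) + list(obstacle_points)
    labels = [BACKGROUND] * len(empty_scan) + [OBSTACLE] * len(obstacle_points)
    return points, labels


rng = random.Random(0)
# Stand-in for a recorded empty scene: a flat 21 x 21 grid of floor points.
empty_scan = [(x * 0.5, y * 0.5, 0.0) for x in range(-10, 11) for y in range(-10, 11)]
obstacle = synth_box(center=(2.0, 1.0, 0.5), size=(1.0, 1.0, 1.0), n=50, rng=rng)

points, labels = populate_scene(empty_scan, obstacle)
print(len(points), labels.count(BACKGROUND), labels.count(OBSTACLE))
```

In a realistic pipeline the obstacle points would have to be consistent with the sensor's viewpoint and scan pattern, and background points hidden behind the inserted object would have to be removed; both steps are left out here.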

DOI: https://doi.org/10.2478/fcds-2025-0013 | Journal eISSN: 2300-3405 | Journal ISSN: 0867-6356
Language: English
Page range: 347 - 371
Submitted on: Dec 1, 2024 | Accepted on: Jun 17, 2025 | Published on: Aug 21, 2025
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2025 Konrad Cop, Bartosz Sułek, Tomasz Trzciński, published by Poznan University of Technology
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.