Efficient Astronomical Data Condensation Using Approximate Nearest Neighbors

Abstract

Extracting useful information from astronomical observations is one of the most challenging tasks in data exploration, largely because of the volume of data acquired with advanced observational tools. While other challenges typical of big data problems (such as data variety) are also present, dataset size is the most significant obstacle to visualization and subsequent analysis. This paper studies an efficient data condensation algorithm that produces a compact representation of a dataset. It is based on fast nearest neighbor calculation using tree structures and parallel processing. The possibility of identifying neighbors only approximately, to further improve the algorithm's running time, is also evaluated. The properties of the proposed approach, in terms of both performance and condensation quality, are assessed experimentally on astronomical datasets related to the GAIA mission. It is concluded that the introduced technique might serve as a scalable method of alleviating the problem of dataset size.
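
The abstract does not spell out the condensation procedure, so the following is only an illustrative sketch of how tree-based, optionally approximate and parallel nearest neighbor search could drive a condensation step. It assumes a density-based selection rule (keep the point with the smallest k-th neighbor distance, then discard everything within twice that distance), which may differ from the rule used in the paper. The function name `condense` and the parameters `k`, `eps`, and `workers` are hypothetical; the tree and its query methods come from SciPy's `cKDTree`.

```python
import numpy as np
from scipy.spatial import cKDTree


def condense(points, k=5, eps=0.0, workers=-1):
    """Return indices of a condensed subset of `points` (illustrative sketch).

    Assumed rule, not necessarily the paper's: visit points in order of
    increasing distance to their k-th nearest neighbor, keep each point that
    has not yet been discarded, and discard all points within twice its
    k-th neighbor distance. Requires k < len(points).
    """
    points = np.asarray(points, dtype=float)
    tree = cKDTree(points)

    # Ask for k + 1 neighbors because each point is returned as its own
    # neighbor at distance 0. eps > 0 allows approximate neighbors;
    # workers=-1 lets SciPy use all available cores for the batch query.
    dists, _ = tree.query(points, k=k + 1, eps=eps, workers=workers)
    r_k = dists[:, -1]  # (approximate) distance to the k-th neighbor

    alive = np.ones(len(points), dtype=bool)
    kept = []
    for i in np.argsort(r_k):  # densest regions first
        if not alive[i]:
            continue
        kept.append(i)
        # Discard every point inside the ball of radius 2 * r_k around i
        # (this includes i itself, which is already recorded in `kept`).
        neighbors = tree.query_ball_point(points[i], 2.0 * r_k[i])
        alive[neighbors] = False
    return np.array(kept)
```

Under these assumptions, a larger `k` yields a smaller condensed set, while a larger `eps` trades neighbor accuracy for query speed. Calling `condense(data, k=10, eps=0.5)` on an `(n, d)` array returns the row indices of the retained points.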

DOI: https://doi.org/10.2478/amcs-2019-0034 | Journal eISSN: 2083-8492 | Journal ISSN: 1641-876X
Language: English
Page range: 467 - 476
Submitted on: Nov 11, 2018
Accepted on: Mar 18, 2019
Published on: Sep 28, 2019
Published by: University of Zielona Góra
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2019 Szymon Łukasik, Konrad Lalik, Piotr Sarna, Piotr A. Kowalski, Małgorzata Charytanowicz, Piotr Kulczycki, published by University of Zielona Góra
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.