Speedup of the k-Means Algorithm for Partitioning Large Datasets of Flat Points by a Preliminary Partition and Selecting Initial Centroids

By: Vadim Romanuke  
Open Access | Aug 2023

References

  1. W. Kaplan, Maxima and Minima with Applications: Practical Optimization and Duality, Wiley Series in Discrete Mathematics and Optimization, vol. 51. John Wiley & Sons, 2011, p. 61.
  2. K. A. Randolph and L. L. Myers, Basic Statistics in Multivariate Analysis, Pocket Guide to Social Work Research Methods. Oxford University Press, Oxford, England, 2013, p. 116.
  3. S. Li, “A 1.488 approximation algorithm for the uncapacitated facility location problem,” in Automata, Languages and Programming. Lecture Notes in Computer Science, vol. 6756. Springer, 2011, pp. 77–88. https://doi.org/10.1007/978-3-642-22012-8_5
  4. N. Megiddo and A. Tamir, “On the complexity of locating linear facilities in the plane,” Operations Research Letters, vol. 1, no. 5, pp. 194–197, 1982. https://doi.org/10.1016/0167-6377(82)90039-6
  5. T. F. Gonzalez, “Clustering to minimize the maximum intercluster distance,” Theoretical Computer Science, vol. 38, pp. 293–306, 1985. https://doi.org/10.1016/0304-3975(85)90224-5
  6. A. Ahmadi-Javid, P. Seyedi, and S. Syam, “A survey of healthcare facility location,” Computers & Operations Research, vol. 79, pp. 223–263, Mar. 2017. https://doi.org/10.1016/j.cor.2016.05.018
  7. J. A. Hartigan and M. A. Wong, “Algorithm AS 136: A k-Means clustering algorithm,” Journal of the Royal Statistical Society, Series C, vol. 28, no. 1, pp. 100–108, 1979. https://doi.org/10.2307/2346830
  8. M. E. Celebi, H. A. Kingravi, and P. A. Vela, “A comparative study of efficient initialization methods for the k-means clustering algorithm,” Expert Systems with Applications, vol. 40, no. 1, pp. 200–210, Jan. 2013. https://doi.org/10.1016/j.eswa.2012.07.021
  9. A. Vattani, “k-means requires exponentially many iterations even in the plane,” Discrete and Computational Geometry, vol. 45, no. 4, pp. 596–616, Mar. 2011. https://doi.org/10.1007/s00454-011-9340-1
  10. M. Mahajan, P. Nimbhorkar, and K. Varadarajan, “The planar k-means problem is NP-hard,” in Lecture Notes in Computer Science, vol. 5431. Springer, 2009, pp. 274–285. https://doi.org/10.1007/978-3-642-00202-1_24
  11. T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, “An efficient k-means clustering algorithm: Analysis and implementation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, Jul. 2002. https://doi.org/10.1109/TPAMI.2002.1017616
  12. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, “Section 16.1. Gaussian mixture models and k-means clustering,” in Numerical Recipes: The Art of Scientific Computing, 3rd ed. Cambridge University Press, New York, NY, 2007.
  13. V. V. Romanuke, “Optimization of a dataset for a machine learning task by clustering and selecting closest-to-the-centroid objects,” Herald of Khmelnytskyi National University. Technical Sciences, vol. 1, no. 6, pp. 263–265, 2018.
  14. L. Bottou and Y. Bengio, “Convergence properties of the K-means algorithms,” in Proceedings of the 7th International Conference on Neural Information Processing Systems (NIPS’94), Jan. 1994, pp. 585–592.
  15. V. V. Romanuke, “Evolution of expert competences in estimating a finite set of objects by a given comparison scale via pairwise comparison matrices within the space of positive inverse-symmetric matrices,” Herald of Khmelnytskyi National University. Technical Sciences, no. 2, pp. 25–29, 2016.
  16. C. Darken and J. Moody, “Note on learning rate schedules for stochastic optimization,” in R. P. Lippman, R. Moody, and D. S. Touretzky (eds.), Advances in Neural Information Processing Systems 3 (NIPS 1990), Morgan Kaufmann, Palo Alto, Denver, CO, 1991, pp. 832–838.
  17. J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, 1967, pp. 281–296.
  18. R. Ostrovsky, Y. Rabani, L. J. Schulman, and C. Swamy, “The effectiveness of Lloyd-type methods for the k-means problem,” in Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS’06), Berkeley, CA, USA, Oct. 2006, pp. 165–174. https://doi.org/10.1109/FOCS.2006.75
  19. T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, “A local search approximation algorithm for k-means clustering,” Computational Geometry: Theory and Applications, vol. 28, no. 2–3, pp. 89–112, Jun. 2004. https://doi.org/10.1016/j.comgeo.2004.03.003
  20. A. Chakrabarty and S. Das, “On strong consistency of kernel k-means: A Rademacher complexity approach,” Statistics & Probability Letters, vol. 182, 2022, Art. no. 109291. https://doi.org/10.1016/j.spl.2021.109291
  21. A. M. Ikotun, A. E. Ezugwu, L. Abualigah, B. Abuhaija, and J. Heming, “K-means clustering algorithms: A comprehensive review, variants analysis, and advances in the era of big data,” Information Sciences, vol. 622, pp. 178–210, Apr. 2023. https://doi.org/10.1016/j.ins.2022.11.139
  22. S. J. Phillips, “Acceleration of K-Means and related clustering algorithms,” in D. M. Mount and C. Stein (eds.), Lecture Notes in Computer Science, vol. 2409. Springer, 2002, pp. 166–177. https://doi.org/10.1007/3-540-45643-0_13
  23. G. Hamerly, “Making k-means even faster,” in Proceedings of the 2010 SIAM International Conference on Data Mining, 2010, pp. 130–140. https://doi.org/10.1137/1.9781611972801.12
  24. D. Arthur and S. Vassilvitskii, “How slow is the k-means method?,” in Proceedings of the Twenty-Second Annual Symposium on Computational Geometry (SCG’06), Jun. 2006, pp. 144–153. https://doi.org/10.1145/1137856.1137880
  25. P. Fränti and S. Sieranoja, “How much can k-means be improved by using better initialization and repeats?,” Pattern Recognition, vol. 93, pp. 95–112, Sep. 2019. https://doi.org/10.1016/j.patcog.2019.04.014
  26. V. V. Romanuke, “Fast-and-smoother uplink power control algorithm based on distance ratios for wireless data transfer systems,” Studies in Informatics and Control, vol. 28, no. 2, pp. 147–156, 2019. https://doi.org/10.24846/v28i2y201903
  27. B. Żak and S. Hożyń, “Local image features matching for real-time seabed tracking applications,” Journal of Marine Engineering & Technology, vol. 16, no. 4, pp. 273–282, Oct. 2017. https://doi.org/10.1080/20464177.2017.1386266
  28. S. Mukherjee, M. Cajić, D. Karličić, and S. Adhikari, “Enhancement of band-gap characteristics in hexagonal and re-entrant lattices via curved beams,” Composite Structures, vol. 306, Feb. 2023, Art. no. 116591. https://doi.org/10.1016/j.compstruct.2022.116591
  29. J. Dong and H. Fan, “Crushing behaviors of buckling-oriented hexagonal lattice structures,” Mechanics of Materials, vol. 165, Feb. 2022, Art. no. 104160. https://doi.org/10.1016/j.mechmat.2021.104160
  30. M. I. Español, D. Golovaty, and J. P. Wilber, “A discrete-to-continuum model of weakly interacting incommensurate two-dimensional lattices: The hexagonal case,” Journal of the Mechanics and Physics of Solids, vol. 173, Apr. 2023, Art. no. 105229. https://doi.org/10.1016/j.jmps.2023.105229
  31. Y. Nakata, M. Yoshida, K. Osawa, and N. Miyanaga, “Fabricating a regular hexagonal lattice structure by interference pattern of six femtosecond laser beams,” Applied Surface Science, vol. 417, pp. 69–72, Sep. 2017. https://doi.org/10.1016/j.apsusc.2017.03.236
  32. J. Carstensen, “About hexagons,” Mathematical Spectrum, vol. 33, no. 2, pp. 37–40, 2000–2001.
  33. M. J. Wenninger, Polyhedron Models. Cambridge University Press, New York, NY, 1974, p. 9.
  34. R. G. Gallager, Stochastic Processes: Theory for Applications. Cambridge University Press, New York, NY, 2013.
  35. A. Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd ed. McGraw-Hill, New York, NY, 1991.
  36. A. El Korchi and Y. Ghanou, “2D geometric shapes dataset – for machine learning and pattern recognition,” Data in Brief, vol. 32, Oct. 2020, Art. no. 106090. https://doi.org/10.1016/j.dib.2020.106090
  37. O. N. Almasi and M. Rouhani, “A geometric-based data reduction approach for large low dimensional datasets: Delaunay triangulation in SVM algorithms,” Machine Learning with Applications, vol. 4, Jun. 2021, Art. no. 100025. https://doi.org/10.1016/j.mlwa.2021.100025
  38. N. Joorabloo, M. Jalili, and Y. Ren, “Improved recommender systems by denoising ratings in highly sparse datasets through individual rating confidence,” Information Sciences, vol. 601, pp. 242–254, Jul. 2022. https://doi.org/10.1016/j.ins.2022.03.068
  39. R. B. Arantes, G. Vogiatzis, and D. R. Faria, “Learning an augmentation strategy for sparse datasets,” Image and Vision Computing, vol. 117, Jan. 2022, Art. no. 104338. https://doi.org/10.1016/j.imavis.2021.104338
DOI: https://doi.org/10.2478/acss-2023-0001 | Journal eISSN: 2255-8691 | Journal ISSN: 2255-8683
Language: English
Page range: 1–12
Published on: Aug 17, 2023
Published by: Riga Technical University
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2023 Vadim Romanuke, published by Riga Technical University
This work is licensed under the Creative Commons Attribution 4.0 License.