Speedup of the k-Means Algorithm for Partitioning Large Datasets of Flat Points by a Preliminary Partition and Selecting Initial Centroids

By: Vadim Romanuke  
Open Access | Aug 2023

References

  1. W. Kaplan, Maxima and Minima with Applications: Practical Optimization and Duality, Wiley Series in Discrete Mathematics and Optimization, vol. 51. John Wiley & Sons, 2011, p. 61.
  2. K. A. Randolph and L. L. Myers, Basic Statistics in Multivariate Analysis, Pocket Guide to Social Work Research Methods. Oxford University Press, Oxford, England, 2013, p. 116.
  3. S. Li, “A 1.488 approximation algorithm for the uncapacitated facility location problem,” in Automata, Languages and Programming. Lecture Notes in Computer Science, vol. 6756. Springer, 2011, pp. 77–88. https://doi.org/10.1007/978-3-642-22012-8_5
  4. N. Megiddo and A. Tamir, “On the complexity of locating linear facilities in the plane,” Operations Research Letters, vol. 1, no. 5, pp. 194–197, 1982. https://doi.org/10.1016/0167-6377(82)90039-6
  5. T. F. Gonzalez, “Clustering to minimize the maximum intercluster distance,” Theoretical Computer Science, vol. 38, pp. 293–306, 1985. https://doi.org/10.1016/0304-3975(85)90224-5
  6. A. Ahmadi-Javid, P. Seyedi, and S. Syam, “A survey of healthcare facility location,” Computers & Operations Research, vol. 79, pp. 223–263, Mar. 2017. https://doi.org/10.1016/j.cor.2016.05.018
  7. J. A. Hartigan and M. A. Wong, “Algorithm AS 136: A k-Means clustering algorithm,” Journal of the Royal Statistical Society, Series C, vol. 28, no. 1, pp. 100–108, 1979. https://doi.org/10.2307/2346830
  8. M. E. Celebi, H. A. Kingravi, and P. A. Vela, “A comparative study of efficient initialization methods for the k-means clustering algorithm,” Expert Systems with Applications, vol. 40, no. 1, pp. 200–210, Jan. 2013. https://doi.org/10.1016/j.eswa.2012.07.021
  9. A. Vattani, “k-means requires exponentially many iterations even in the plane,” Discrete and Computational Geometry, vol. 45, no. 4, pp. 596–616, Mar. 2011. https://doi.org/10.1007/s00454-011-9340-1
  10. M. Mahajan, P. Nimbhorkar, and K. Varadarajan, “The planar k-means problem is NP-hard,” in Lecture Notes in Computer Science, vol. 5431. Springer, 2009, pp. 274–285. https://doi.org/10.1007/978-3-642-00202-1_24
  11. T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, “An efficient k-means clustering algorithm: Analysis and implementation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, Jul. 2002. https://doi.org/10.1109/TPAMI.2002.1017616
  12. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, “Section 16.1. Gaussian mixture models and k-means clustering,” in Numerical Recipes: The Art of Scientific Computing, 3rd ed. Cambridge University Press, New York, NY, 2007.
  13. V. V. Romanuke, “Optimization of a dataset for a machine learning task by clustering and selecting closest-to-the-centroid objects,” Herald of Khmelnytskyi National University. Technical Sciences, vol. 1, no. 6, pp. 263–265, 2018.
  14. L. Bottou and Y. Bengio, “Convergence properties of the K-means algorithms,” in Proceedings of the 7th International Conference on Neural Information Processing Systems (NIPS’94), Jan. 1994, pp. 585–592.
  15. V. V. Romanuke, “Evolution of expert competences in estimating a finite set of objects by a given comparison scale via pairwise comparison matrices within the space of positive inverse-symmetric matrices,” Herald of Khmelnytskyi National University. Technical Sciences, no. 2, pp. 25–29, 2016.
  16. C. Darken and J. Moody, “Note on learning rate schedules for stochastic optimization,” in R. P. Lippman, R. Moody, and D. S. Touretzky (eds.), Advances in Neural Information Processing Systems 3 (NIPS 1990), Morgan Kaufmann, Palo Alto, Denver, CO, 1991, pp. 832–838.
  17. J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, 1967, pp. 281–296.
  18. R. Ostrovsky, Y. Rabani, L. J. Schulman, and C. Swamy, “The effectiveness of Lloyd-type methods for the k-means problem,” in Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS’06), Berkeley, CA, USA, Oct. 2006, pp. 165–174. https://doi.org/10.1109/FOCS.2006.75
  19. T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, “A local search approximation algorithm for k-means clustering,” Computational Geometry: Theory and Applications, vol. 28, no. 2–3, pp. 89–112, Jun. 2004. https://doi.org/10.1016/j.comgeo.2004.03.003
  20. A. Chakrabarty and S. Das, “On strong consistency of kernel k-means: A Rademacher complexity approach,” Statistics & Probability Letters, vol. 182, 2022, Art. no. 109291. https://doi.org/10.1016/j.spl.2021.109291
  21. A. M. Ikotun, A. E. Ezugwu, L. Abualigah, B. Abuhaija, and J. Heming, “K-means clustering algorithms: A comprehensive review, variants analysis, and advances in the era of big data,” Information Sciences, vol. 622, pp. 178–210, Apr. 2023. https://doi.org/10.1016/j.ins.2022.11.139
  22. S. J. Phillips, “Acceleration of K-Means and related clustering algorithms,” in D. M. Mount and C. Stein (eds.), Lecture Notes in Computer Science, vol. 2409. Springer, 2002, pp. 166–177. https://doi.org/10.1007/3-540-45643-0_13
  23. G. Hamerly, “Making k-means even faster,” in Proceedings of the 2010 SIAM International Conference on Data Mining, 2010, pp. 130–140. https://doi.org/10.1137/1.9781611972801.12
  24. D. Arthur and S. Vassilvitskii, “How slow is the k-means method?,” in Proceedings of the Twenty-Second Annual Symposium on Computational Geometry (SCG’06), Jun. 2006, pp. 144–153. https://doi.org/10.1145/1137856.1137880
  25. P. Fränti and S. Sieranoja, “How much can k-means be improved by using better initialization and repeats?,” Pattern Recognition, vol. 93, pp. 95–112, Sep. 2019. https://doi.org/10.1016/j.patcog.2019.04.014
  26. V. V. Romanuke, “Fast-and-smoother uplink power control algorithm based on distance ratios for wireless data transfer systems,” Studies in Informatics and Control, vol. 28, no. 2, pp. 147–156, 2019. https://doi.org/10.24846/v28i2y201903
  27. B. Żak and S. Hożyń, “Local image features matching for real-time seabed tracking applications,” Journal of Marine Engineering & Technology, vol. 16, no. 4, pp. 273–282, Oct. 2017. https://doi.org/10.1080/20464177.2017.1386266
  28. S. Mukherjee, M. Cajić, D. Karličić, and S. Adhikari, “Enhancement of band-gap characteristics in hexagonal and re-entrant lattices via curved beams,” Composite Structures, vol. 306, Feb. 2023, Art. no. 116591. https://doi.org/10.1016/j.compstruct.2022.116591
  29. J. Dong and H. Fan, “Crushing behaviors of buckling-oriented hexagonal lattice structures,” Mechanics of Materials, vol. 165, Feb. 2022, Art. no. 104160. https://doi.org/10.1016/j.mechmat.2021.104160
  30. M. I. Español, D. Golovaty, and J. P. Wilber, “A discrete-to-continuum model of weakly interacting incommensurate two-dimensional lattices: The hexagonal case,” Journal of the Mechanics and Physics of Solids, vol. 173, Apr. 2023, Art. no. 105229. https://doi.org/10.1016/j.jmps.2023.105229
  31. Y. Nakata, M. Yoshida, K. Osawa, and N. Miyanaga, “Fabricating a regular hexagonal lattice structure by interference pattern of six femtosecond laser beams,” Applied Surface Science, vol. 417, pp. 69–72, Sep. 2017. https://doi.org/10.1016/j.apsusc.2017.03.236
  32. J. Carstensen, “About hexagons,” Mathematical Spectrum, vol. 33, no. 2, pp. 37–40, 2000–2001.
  33. M. J. Wenninger, Polyhedron Models. Cambridge University Press, New York, NY, 1974, p. 9.
  34. R. G. Gallager, Stochastic Processes: Theory for Applications. Cambridge University Press, New York, NY, 2013.
  35. A. Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd ed. McGraw-Hill, New York, NY, 1991.
  36. A. El Korchi and Y. Ghanou, “2D geometric shapes dataset – for machine learning and pattern recognition,” Data in Brief, vol. 32, Oct. 2020, Art. no. 106090. https://doi.org/10.1016/j.dib.2020.106090
  37. O. N. Almasi and M. Rouhani, “A geometric-based data reduction approach for large low dimensional datasets: Delaunay triangulation in SVM algorithms,” Machine Learning with Applications, vol. 4, Jun. 2021, Art. no. 100025. https://doi.org/10.1016/j.mlwa.2021.100025
  38. N. Joorabloo, M. Jalili, and Y. Ren, “Improved recommender systems by denoising ratings in highly sparse datasets through individual rating confidence,” Information Sciences, vol. 601, pp. 242–254, Jul. 2022. https://doi.org/10.1016/j.ins.2022.03.068
  39. R. B. Arantes, G. Vogiatzis, and D. R. Faria, “Learning an augmentation strategy for sparse datasets,” Image and Vision Computing, vol. 117, Jan. 2022, Art. no. 104338. https://doi.org/10.1016/j.imavis.2021.104338
DOI: https://doi.org/10.2478/acss-2023-0001 | Journal eISSN: 2255-8691 | Journal ISSN: 2255-8683
Language: English
Page range: 1–12
Published on: Aug 17, 2023
Published by: Riga Technical University
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2023 Vadim Romanuke, published by Riga Technical University
This work is licensed under the Creative Commons Attribution 4.0 License.