Towards Explainable Graph Spectral Clustering for BERT Embeddings

DOI: https://doi.org/10.14313/jamris-2026-005 | Journal eISSN: 2080-2145 | Journal ISSN: 1897-8649
Language: English
Page range: 53 - 65
Submitted on: Jul 10, 2025 | Accepted on: Aug 10, 2025 | Published on: Mar 31, 2026
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2026 Mieczysław A. Kłopotek, Sławomir T. Wierzchoń, Bartłomiej Starosta, Piotr Borkowski, Dariusz Czerski, published by Łukasiewicz Research Network – Industrial Research Institute for Automation and Measurements PIAP
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.