References
- S. A. P. Murray, The Library: An Illustrated History, Skyhorse Publishing: New York, NY, 2009.
- S. Lloyd, “Least squares quantization in PCM”, IEEE Transactions on Information Theory, vol. 28, no. 2, 1982, 129–137, 10.1109/TIT.1982.1056489.
- U. von Luxburg, “A tutorial on spectral clustering”, Statistics and Computing, vol. 17, no. 4, 2007, 395–416, https://doi.org/10.1007/s11222-007-9033-z.
- J. Wang and Y. Dong, “Measurement of text similarity: A survey”, Information, vol. 11, no. 9, 2020, 10.3390/info11090421.
- A. Ittoo, L. M. Nguyen, and A. van den Bosch, “Text analytics in industry”, Computers in Industry, vol. 78, no. C, 2016, 96–107, 10.1016/j.compind.2015.12.001.
- G. Salton, A. Wong, and C. S. Yang, “A vector space model for automatic indexing”, Communications of the ACM, vol. 18, no. 11, 1975, 613–620, 10.1145/361219.361220.
- A. P. Sunita Bisht, “Document clustering: A review”, International Journal of Computer Applications, vol. 73, no. 11, 2013, 26–33, 10.5120/12787-0024.
- X. Rong, “word2vec parameter learning explained”, Computing Research Repository, vol. abs/1411.2738, 2014.
- J. H. Lau and T. Baldwin, “An empirical evaluation of doc2vec with practical insights into document embedding generation”. In: Proceedings of the 1st Workshop on Representation Learning for NLP, Berlin, Germany, 2016, 78–86, https://doi.org/10.18653/v1/W16-1609.
- J. Pennington, R. Socher, and C. Manning, “GloVe: Global vectors for word representation”. In: A. Moschitti, B. Pang, and W. Daelemans, eds., Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 2014, 1532–1543, 10.3115/v1/D14-1162.
- J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding”, Computing Research Repository, vol. abs/1810.04805, 2018, https://doi.org/10.48550/arXiv.1810.04805.
- E. Sezerer and S. Tekir, “A survey on neural word embeddings”, Computing Research Repository, vol. abs/2110.01804, 2021.
- Y. Li and T. Yang, “Word embedding for understanding natural language: A survey”. In: S. Srinivasan, ed., Guide to Big Data Applications, 83–104. Springer International Publishing, Cham, 2018.
- L. da Costa, I. Oliveira, and R. Fileto, “Text classification using embeddings: a survey”, Knowledge and Information Systems, vol. 65, 2023, 2761–2803, https://doi.org/10.1007/s10115-023-01856-z.
- L. Stankevičius and M. Lukoševičius, “Extracting sentence embeddings from pretrained transformer models”, Applied Sciences, vol. 14, no. 19, 2024, 8887, https://doi.org/10.3390/app14198887.
- S. Talebi, E. Tong, A. Li, et al., “Exploring the performance and explainability of fine-tuned BERT models for neuroradiology protocol assignment”, BMC Medical Informatics and Decision Making, vol. 24, no. 40, 2024, 10.1186/s12911-024-02444-z.
- N. Kokhlikyan, V. Miglani, M. Martin, E. Wang, B. Alsallakh, J. Reynolds, A. Melnikov, N. Kliushkina, C. Araya, S. Yan, and O. Reblitz-Richardson, “Captum: A unified and generic model interpretability library for PyTorch”, Computing Research Repository, vol. abs/2009.07896, 2020, https://doi.org/10.48550/arXiv.2009.07896.
- M. Sundararajan, A. Taly, and Q. Yan, “Axiomatic attribution for deep networks”. In: Proc. 34th International Conference on Machine Learning, vol. 70, 2017, 3319–3328.
- N. O. Tan, J. Bensemann, D. Benavides-Prado, Y. Chen, M. Gahegan, L. Lee, A. Y. Peng, P. Riddle, and M. Witbrock, “An explainability analysis of a sentiment prediction task using a transformer-based attention filter”. In: Proceedings of the Ninth Annual Conference on Advances in Cognitive Systems, 2021.
- Z. Xu and Y. Ke, “Effective and efficient spectral clustering on text and link data”. In: CIKM ’16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, 2016, 357–366, https://doi.org/10.1145/2983323.2983708.
- S. T. Wierzchoń and M. A. Kłopotek, Modern Algorithms of Cluster Analysis, volume 34 of Studies in Big Data, Springer, Cham, 2018, 421 pp.
- I. S. Dhillon, Y. Guan, and B. Kulis, “Kernel k-means: Spectral clustering and normalized cuts”. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 2004, 551–556, 10.1145/1014052.1014118.
- B. Starosta, M. A. Kłopotek, S. T. Wierzchoń, D. Czerski, M. Sydow, and P. Borkowski, “Explainable graph spectral clustering of text documents”, PLoS One, vol. 20, no. 2, 2025, e0313238, 10.1371/journal.pone.0313238.
- C. R. Harris, K. J. Millman, S. J. van der Walt, et al., “Array programming with NumPy”, Nature, vol. 585, 2020, 357–362, 10.1038/s41586-020-2649-2.
- P. Virtanen, R. Gommers, T. E. Oliphant, et al., “SciPy 1.0: Fundamental algorithms for scientific computing in Python”, Nature Methods, vol. 17, 2020, 261–272, 10.1038/s41592-019-0686-2.
- L. Buitinck, G. Louppe, M. Blondel, et al., “API design for machine learning software: experiences from the scikit-learn project”, Computing Research Repository, vol. abs/1309.0238, 2013, https://doi.org/10.48550/arXiv.1309.0238.
- H. Kim and H. K. Kim, “clustering4docs GitHub repository”, 2020. https://pypi.org/project/soyclustering/.
- H. Kim, H. K. Kim, and S. Cho, “Improving spherical k-means for document clustering: Fast initialization, sparse centroid projection, and efficient cluster labeling”, Expert Systems with Applications, vol. 150, 2020, 113288, https://doi.org/10.1016/j.eswa.2020.113288.
- H. W. Kuhn, “The Hungarian method for the assignment problem”, Naval Research Logistics Quarterly, vol. 2, 1955, 83–97.
- D. Pfitzner, R. Leibbrandt, and D. Powers, “Characterization and evaluation of similarity measures for pairs of clusterings”, Knowledge and Information Systems, vol. 19, no. 3, 2009, 361–394, https://doi.org/10.1007/s10115-008-0150-6.
- E. Achtert, S. Goldhofer, H.-P. Kriegel, E. Schubert, and A. Zimek, “Evaluation of clusterings - metrics and visual support”. In: 2012 IEEE 28th International Conference on Data Engineering, 2012, 1285–1288, 10.1109/ICDE.2012.128.
- A. Balagopalan, H. Zhang, K. Hamidieh, T. Hartvigsen, F. Rudzicz, and M. Ghassemi, “The road to explainability is paved with bias: Measuring the fairness of explanations”. In: 2022 ACM Conference on Fairness, Accountability, and Transparency, 2022, 1194–1206, 10.1145/3531146.3533179.
- R. R. Hoffman, S. T. Mueller, G. Klein, and J. Litman, “Measures for explainable AI: Explanation goodness, user satisfaction, mental models, curiosity, trust, and human-AI performance”, Frontiers in Computer Science, vol. 5, 2023, https://doi.org/10.3389/fcomp.2023.1096257.
- A. Holzinger, A. M. Carrington, and H. Müller, “Measuring the quality of explanations: The system causability scale (SCS). Comparing human and machine explanations”, Computing Research Repository, vol. abs/1912.09024, 2019.
- F. Sovrano and F. Vitali, “An objective metric for explainable AI: how and why to estimate the degree of explainability”, Computing Research Repository, vol. abs/2109.05327, 2021.
- H. Löfström, K. Hammar, and U. Johansson, A Meta Survey of Quality Evaluation Criteria in Explanation Methods, 55–63, Springer International Publishing, 2022.
- J. Zhou, A. H. Gandomi, F. Chen, and A. Holzinger, “Evaluating the quality of machine learning explanations: A survey on methods and metrics”, Electronics, vol. 10, no. 5, 2021, 10.3390/electronics10050593.
- M. Tamajka, “How to measure the quality of explanations of AI predictions”, 2022. https://kinit.sk/how-to-measure-the-quality-of-explanations-of-ai-predictions/.
- P. Linardatos, V. Papastefanopoulos, and S. Kotsiantis, “Explainable AI: A review of machine learning interpretability methods”, Entropy, vol. 23, no. 1, 2021, 10.3390/e23010018.
- M. Baroni, G. Dinu, and G. Kruszewski, “Don’t count, predict! a systematic comparison of context-counting vs. context-predicting semantic vectors”. In: K. Toutanova and H. Wu, eds., Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Baltimore, Maryland, 2014, 238–247, 10.3115/v1/P14-1023.
- M. A. Kłopotek, S. T. Wierzchoń, B. Starosta, D. Czerski, and P. Borkowski, “A method for handling negative similarities in explainable graph spectral clustering of text documents - Extended Version”, Computing Research Repository, vol. abs/2504.12360, 2025, https://doi.org/10.48550/arXiv.2504.12360.
- C. Rudin, “Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead”, Nature Machine Intelligence, vol. 1, no. 5, 2019, 206–215, 10.1038/S42256-019-0048-X.
- S. Farquhar, J. Kossen, L. Kuhn, et al., “Detecting hallucinations in large language models using semantic entropy”, Nature, vol. 630, 2024, 625–630, https://doi.org/10.1038/s41586-024-07421-0.
