A Multi-View Fuzzy Clustering Framework for Semantic-Rich Text Data Using SBERT and Ensemble Learning

Open Access | Jun 2025

References

  1. Q. Chen, Y. Peng and Z. Lu, “BioSentVec: creating sentence embeddings for biomedical texts,” in 2019 IEEE International Conference on Healthcare Informatics (ICHI), Xi’an, China, Jun. 2019, pp. 1–5. https://doi.org/10.1109/ichi.2019.8904728
  2. Y. Liu, M. Liu, and X. Wang, “Towards semantically sensitive text clustering: A feature space modeling technology based on dimension extension,” PLoS ONE, vol. 10, no. 3, Mar. 2015, Art. no. e0117390. https://doi.org/10.1371/journal.pone.0117390
  3. V. Mehta, S. Bawa, and J. Singh, “WEClustering: word embeddings based text clustering technique for large datasets,” Complex & Intelligent Systems, vol. 7, no. 6, pp. 3211–3224, Sep. 2021. https://doi.org/10.1007/s40747-021-00512-9
  4. D. Zhukov, E. Andrianova, K. Otradnov, and L. Istratov, “Soft clustering method for text mining, with an opportunity to attribute them to different semantic groups,” ITM Web of Conferences, vol. 18, Apr. 2018, Art. no. 03004. https://doi.org/10.1051/itmconf/20181803004
  5. L. Fu, P. Lin, A. V. Vasilakos, and S. Wang, “An overview of recent multi-view clustering,” Neurocomputing, vol. 402, pp. 148–161, Aug. 2020. https://doi.org/10.1016/j.neucom.2020.02.104
  6. C. Yuan, Y. Zhu, Z. Zhong, W. Zheng, and X. Zhu, “Robust self-tuning multi-view clustering,” World Wide Web, vol. 25, pp. 489–512, Feb. 2022. https://doi.org/10.1007/s11280-021-00945-9
  7. A. Kumar, P. Rai, and H. Daume, “Co-regularized multi-view spectral clustering,” in NIPS’11 Proceedings of the 25th International Conference on Neural Information Processing Systems, Dec. 2011, pp. 1413–1421.
  8. K. Chaudhuri, S. M. Kakade, K. Livescu, and K. Sridharan, “Multi-view clustering via canonical correlation analysis,” in Proceedings of the 26th Annual International Conference on Machine Learning, Jun. 2009, pp. 129–136. https://doi.org/10.1145/1553374.1553391
  9. D. Greene and P. Cunningham, “A matrix factorization approach for integrating multiple data views,” in Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 2009, pp. 423–438. https://doi.org/10.1007/978-3-642-04180-8_45
  10. X. Xie and S. Sun, “Multi-view clustering ensembles,” in 2013 International Conference on Machine Learning and Cybernetics, vol. 1, Tianjin, China, Jul. 2013, pp. 51–56. https://doi.org/10.1109/ICMLC.2013.6890443
  11. Z. Xu and S. Sun, “An algorithm on multi-view Adaboost,” in International Conference on Neural Information Processing, 2010, pp. 355–362. https://doi.org/10.1007/978-3-642-17537-4_44
  12. V. Kumar and S. Minz, “Multi-view ensemble learning: an optimal feature set partitioning for high-dimensional data classification,” Knowledge and Information Systems, vol. 49, pp. 1–59, Sep. 2016. https://doi.org/10.1007/s10115-015-0875-y
  13. Z. Tao, H. Liu, S. Li, Z. Ding, and Y. Fu, “From ensemble clustering to multi-view clustering,” in Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI), 2017, pp. 2843–2849. https://doi.org/10.24963/ijcai.2017/396
  14. E. Hammami and R. Faiz, “Text clustering based on multi-view representations,” in CIRCLE’22: Conference of the Information Retrieval Communities in Europe, Jul. 2022. [Online]. Available: https://ceur-ws.org/Vol-3178/CIRCLE_2022_paper_32.pdf
  15. G. Cui, “Analysis on the country differences of CSR of multinational corporations based on fuzzy C-Means clustering,” Journal of Physics: Conference Series, vol. 1533, no. 2, Apr. 2020, Art. no. 022079. https://doi.org/10.1088/1742-6596/1533/2/022079
  16. J. C. Bezdek, “A convergence theorem for the fuzzy ISODATA clustering algorithms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-2, no. 1, pp. 1–8, Jan. 1980. https://doi.org/10.1109/TPAMI.1980.4766964
  17. S. Subudhi and S. Panigrahi, “A hybrid mobile call fraud detection model using optimized fuzzy C-means clustering and group method of data handling-based network,” Vietnam Journal of Computer Science, vol. 5, pp. 205–217, May 2018. https://doi.org/10.1007/s40595-018-0116-x
  18. I. D. Mienye and Y. Sun, “A survey of ensemble learning: Concepts, algorithms, applications, and prospects,” IEEE Access, vol. 10, pp. 99129–99149, Sep. 2022. https://doi.org/10.1109/ACCESS.2022.3207287
  19. K. A. Nguyen, W. Chen, B. S. Lin, and U. Seeboonruang, “Comparison of ensemble machine learning methods for soil erosion pin measurements,” ISPRS International Journal of Geo-Information, vol. 10, no. 1, Jan. 2021, Art. no. 42. https://doi.org/10.3390/ijgi10010042
  20. S. Alelyani, “Stable bagging feature selection on medical data,” Journal of Big Data, vol. 8, 2021, Art. no. 11. https://doi.org/10.1186/s40537-020-00385-8
  21. L. Hickman, S. Thapa, L. Tay, M. Cao, and P. Srinivasan, “Text preprocessing for text mining in organizational research: Review and recommendations,” Organizational Research Methods, vol. 25, no. 1, pp. 114–146, 2022. https://doi.org/10.1177/1094428120971683
  22. C. Galli, C. Cusano, S. Guizzardi, N. Donos, and E. Calciolari, “Embeddings for efficient literature screening: A primer for life science investigators,” Metrics, vol. 1, no. 1, Sep. 2024, Art. no. 1. https://doi.org/10.3390/metrics1010001
  23. B. Asadi and R. Hajj, “Prediction of asphalt binder elastic recovery using tree-based ensemble bagging and boosting models,” Construction and Building Materials, vol. 410, Jan. 2024, Art. no. 134154. https://doi.org/10.1016/j.conbuildmat.2023.134154
  24. A. A. Wani, “Comprehensive analysis of clustering algorithms: exploring limitations and innovative solutions,” PeerJ Computer Science, vol. 10, Aug. 2024, Art. no. e2286. https://doi.org/10.7717/peerj-cs.2286
DOI: https://doi.org/10.2478/acss-2025-0011 | Journal eISSN: 2255-8691 | Journal ISSN: 2255-8683
Language: English
Page range: 91 - 97
Submitted on: Mar 2, 2025
Accepted on: May 15, 2025
Published on: Jun 5, 2025
Published by: Riga Technical University
In partnership with: Paradigm Publishing Services
Publication frequency: 1 time per year

© 2025 Nik Siti Madihah Nik Mangsor, Syerina Azlin Md Nasir, Shuzlina Abdul-Rahman, Rosmayati Mohemad, published by Riga Technical University
This work is licensed under the Creative Commons Attribution 4.0 License.