AI-Driven Peer Company Identification: A Semantic Text-Similarity Approach Beyond Traditional Industry Classification Systems

Open Access
|May 2026

References

  1. Banerjee, K., Prasad C, V., Gupta, R. R., Vyas, K., Anushree, H., & Mishra, B. (2020). Exploring Alternatives to Softmax Function. https://doi.org/10.48550/arXiv.2011.11538
  2. Bonne, G., Lo, A. W., Prabhakaran, A., Siah, K. W., Singh, M., Wang, X., Zangari, P., & Zhang, H. (2022). An Artificial Intelligence-Based Industry Peer Grouping System. Journal of Financial Data Science, 4(2), 9–36. https://doi.org/10.3905/jfds.2022.1.090
  3. Car, T., Šimac, I., & Šuman, S. (2025). Framing prompts as user stories: Effects on the output quality of generative AI. ENTRENOVA - ENTerprise REsearch InNOVAtion, 11(1). https://doi.org/10.54820/entrenova-2025-0057
  4. Caelen, O. (2017). A Bayesian interpretation of the confusion matrix. Annals of Mathematics and Artificial Intelligence, 81(3–4), 429–450. https://doi.org/10.1007/s10472-017-9564-8
  5. Chang, Y., Kong, L., Jia, K., & Meng, Q. (2021). Chinese named entity recognition method based on BERT. Proceedings of 2021 IEEE International Conference on Data Science and Computer Application, ICDSCA 2021, 294–299. https://doi.org/10.1109/ICDSCA53499.2021.9650256
  6. Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 16. https://doi.org/10.48550/arXiv.1810.04805
  7. Ding, K., Peng, X., & Wang, Y. (2019). A machine learning-based peer selection method with financial ratios. Accounting Horizons, 33(3), 75–87. https://doi.org/10.2308/acch-52454
  8. Eaton, G. W., Guo, F., Liu, T., & Officer, M. S. (2022). Peer selection and valuation in mergers and acquisitions. Journal of Financial Economics, 146(1), 230–255. https://doi.org/10.1016/j.jfineco.2021.09.006
  9. Fildor, D., & Pejić Bach, M. (2023). Testing the ability of ChatGPT to categorise urgent and non-urgent patient conditions: Who ya gonna call? ENTRENOVA - ENTerprise REsearch InNOVAtion Journal, 9(1), 101–112. https://doi.org/10.54820/entrenova-2023-0010
  10. Gordon, M., & Kochen, M. (1989). Recall-precision trade-off: A derivation. Journal of the American Society for Information Science, 40(3), 145–151. https://doi.org/10.1002/(SICI)1097-4571(198905)40:3<145::AID-ASI1>3.0.CO;2-I
  11. Goutte, C., & Gaussier, E. (2005). A Probabilistic Interpretation of Precision, Recall and F-Score, with Implication for Evaluation. Lecture Notes in Computer Science, 3408, 345–359. https://doi.org/10.1007/978-3-540-31865-1_25
  12. He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770–778.
  13. Hurtik, P., Tomasiello, S., Hula, J., & Hynar, D. (2022). Binary cross-entropy with dynamical clipping. Neural Computing and Applications, 34(14), 12029–12041. https://doi.org/10.1007/s00521-022-07091-x
  14. Husmann, S., Shivarova, A., & Steinert, R. (2022). Company classification using machine learning. Expert Systems with Applications, 195. https://doi.org/10.1016/j.eswa.2022.116598
  15. Jagrič, T., & Herman, A. (2024). AI Model for Industry Classification Based on Website Data. Information, 15(2), 89. https://doi.org/10.3390/info15020089
  16. Kingma, D. P., & Ba, J. (2017). Adam: A Method for Stochastic Optimization. 15. https://doi.org/10.48550/arXiv.1412.6980
  17. Krislock, N., & Wolkowicz, H. (2012). Euclidean distance matrices and applications. International Series in Operations Research and Management Science, 166, 879–914. https://doi.org/10.1007/978-1-4614-0769-0_30
  18. Lei Ba, J., Kiros, J. R., & Hinton, G. E. (2016). Layer Normalization. 14. https://doi.org/10.48550/arXiv.1607.06450
  19. Makridakis, S. (2017). The forthcoming Artificial Intelligence (AI) revolution: Its impact on society and firms. Futures, 90, 46–60. https://doi.org/10.1016/j.futures.2017.03.006
  20. Mao, A., Mohri, M., & Zhong, Y. (2023). Cross-Entropy Loss Functions: Theoretical Analysis and Applications. 40th International Conference on Machine Learning, 23803–23828.
  21. Momeni, M., Mohseni, M., & Soofi, M. (2015). Clustering Stock Market Companies via K-Means Algorithm. Kuwait Chapter of Arabian Journal of Business and Management Review, 4(5), 1–10. https://doi.org/10.12816/0018959
  22. Noels, S., De Ridder, S., Viaene, S., & De Bie, T. (2023). An efficient graph-based peer selection method for financial statements. Intelligent Systems in Accounting, Finance and Management, 30(3), 120–136. https://doi.org/10.1002/isaf.1539
  23. Pascanu, R., Mikolov, T., & Bengio, Y. (2013). On the difficulty of training recurrent neural networks. 30th International Conference on Machine Learning, ICML 2013, PART 3, 2347–2355.
  24. Phillips, R. L., & Ormsby, R. (2016). Industry classification schemes: An analysis and review. Journal of Business and Finance Librarianship, 21(1), 1–25. https://doi.org/10.1080/08963568.2015.1110229
  25. Pejić Bach, M., Krstić, Ž., Seljan, S., & Turulja, L. (2020). Text mining for big data analysis in financial sector: A literature review. Sustainability, 12(3), 1155. https://doi.org/10.3390/su12031155
  26. Puvvala, C. (2019). Company classification. Retrieved from: https://www.kaggle.com/datasets/charanpuvvala/company-classification/data
  27. Qu, C., Yang, L., Qiu, M., Bruce Croft, W., Zhang, Y., & Iyyer, M. (2019). BERT with history answer embedding for conversational question answering. SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 1133–1136. https://doi.org/10.1145/3331184.3331341
  28. Rahutomo, F., Kitasuka, T., & Aritsugi, M. (2012). Semantic Cosine Similarity. The 7th International Student Conference on Advanced Science and Technology. Retrieved from https://www.researchgate.net/profile/Faisal-Rahutomo/publication/262525676_Semantic_Cosine_Similarity/links/0a85e537ee3b675c1e000000/Semantic-Cosine-Similarity.pdf
  29. Rojas, R. (1996). The Backpropagation Algorithm. Neural Networks, 149–182. https://doi.org/10.1007/978-3-642-61068-4_7
  30. Ruby, U. A., Theerthagiri, P., Jacob, J. I., & Vamsidhar, Y. (2020). Binary cross entropy with deep learning technique for Image classification. International Journal of Advanced Trends in Computer Science and Engineering, 9(4), 5393–5397. https://doi.org/10.30534/ijatcse/2020/175942020
  31. Santini, S., & Jain, R. (1999). Similarity measures. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(9), 871–883. https://doi.org/10.1109/34.790428
  32. Shala Riza, L., Abazi Bexheti, L., & Zoroja, J. (2025). Early identification of at-risk students in online education: A deep learning approach to predictive modelling. Business Systems Research, 16(2), 69–91. https://doi.org/10.2478/bsrj-2025-001
  33. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2023). Attention Is All You Need. 15. https://doi.org/10.48550/arXiv.1706.03762
  34. Wright, S. A., & Schultz, A. E. (2018). The rising tide of artificial intelligence and business automation: Developing an ethical framework. Business Horizons, 61(6), 823–832. https://doi.org/10.1016/j.bushor.2018.07.001
  35. Wu, Y., Schuster, M., Chen, Z., Le, Q. V., Norouzi, M., Macherey, W., Krikun, M., Cao, Y., Gao, Q., Macherey, K., Klingner, J., Shah, A., Johnson, M., Liu, X., Kaiser, Ł., Gouws, S., Kato, Y., Kudo, T., Kazawa, H., … Dean, J. (2016). Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. 23. https://doi.org/10.48550/arXiv.1609.08144
  36. Zhuang, Z., Liu, M., Cutkosky, A., & Orabona, F. (2022). Understanding AdamW through Proximal Methods and Scale-Freeness. 24. https://doi.org/10.48550/arXiv.2202.00089
DOI: https://doi.org/10.2478/bsrj-2026-0010 | Journal eISSN: 1847-9375 | Journal ISSN: 1847-8344
Language: English
Page range: 204 - 222
Submitted on: Oct 21, 2024
Accepted on: Aug 15, 2025
Published on: May 10, 2026
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year

© 2026 Timotej Jagrič, Aljaž Herman, published by IRENET - Society for Advancing Innovation and Research in Economy
This work is licensed under the Creative Commons Attribution 4.0 License.