AI-Driven Peer Company Identification: A Semantic Text-Similarity Approach Beyond Traditional Industry Classification Systems
By: Timotej Jagrič and Aljaž Herman
References
- Banerjee, K., Prasad C, V., Gupta, R. R., Vyas, K., Anushree, H., & Mishra, B. (2020). Exploring Alternatives to Softmax Function. https://doi.org/10.48550/arXiv.2011.11538
- Bonne, G., Lo, A. W., Prabhakaran, A., Siah, K. W., Singh, M., Wang, X., Zangari, P., & Zhang, H. (2022). An Artificial Intelligence-Based Industry Peer Grouping System. Journal of Financial Data Science, 4(2), 9–36. https://doi.org/10.3905/jfds.2022.1.090
- Car, T., Šimac, I., & Šuman, S. (2025). Framing prompts as user stories: Effects on the output quality of generative AI. ENTRENOVA - ENTerprise REsearch InNOVAtion, 11(1). https://doi.org/10.54820/entrenova-2025-0057
- Caelen, O. (2017). A Bayesian interpretation of the confusion matrix. Annals of Mathematics and Artificial Intelligence, 81(3–4), 429–450. https://doi.org/10.1007/s10472-017-9564-8
- Chang, Y., Kong, L., Jia, K., & Meng, Q. (2021). Chinese named entity recognition method based on BERT. Proceedings of 2021 IEEE International Conference on Data Science and Computer Application, ICDSCA 2021, 294–299. https://doi.org/10.1109/ICDSCA53499.2021.9650256
- Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 16. https://doi.org/10.48550/arXiv.1810.04805
- Ding, K., Peng, X., & Wang, Y. (2019). A machine learning-based peer selection method with financial ratios. Accounting Horizons, 33(3), 75–87. https://doi.org/10.2308/acch-52454
- Eaton, G. W., Guo, F., Liu, T., & Officer, M. S. (2022). Peer selection and valuation in mergers and acquisitions. Journal of Financial Economics, 146(1), 230–255. https://doi.org/10.1016/j.jfineco.2021.09.006
- Fildor, D., & Pejić Bach, M. (2023). Testing the ability of ChatGPT to categorise urgent and non-urgent patient conditions: Who ya gonna call? ENTRENOVA - ENTerprise REsearch InNOVAtion Journal, 9(1), 101–112. https://doi.org/10.54820/entrenova-2023-0010
- Gordon, M., & Kochen, M. (1989). Recall-precision trade-off: A derivation. Journal of the American Society for Information Science, 40(3), 145–151. https://doi.org/10.1002/(SICI)1097-4571(198905)40:3<145::AID-ASI1>3.0.CO;2-I
- Goutte, C., & Gaussier, E. (2005). A Probabilistic Interpretation of Precision, Recall and F-Score, with Implication for Evaluation. Lecture Notes in Computer Science, 3408, 345–359. https://doi.org/10.1007/978-3-540-31865-1_25
- He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep Residual Learning for Image Recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770–778.
- Hurtik, P., Tomasiello, S., Hula, J., & Hynar, D. (2022). Binary cross-entropy with dynamical clipping. Neural Computing and Applications, 34(14), 12029–12041. https://doi.org/10.1007/s00521-022-07091-x
- Husmann, S., Shivarova, A., & Steinert, R. (2022). Company classification using machine learning. Expert Systems with Applications, 195. https://doi.org/10.1016/j.eswa.2022.116598
- Jagrič, T., & Herman, A. (2024). AI Model for Industry Classification Based on Website Data. Information, 15(2), 89. https://doi.org/10.3390/info15020089
- Kingma, D. P., & Ba, J. (2017). Adam: A Method for Stochastic Optimization. 15. https://doi.org/10.48550/arXiv.1412.6980
- Krislock, N., & Wolkowicz, H. (2012). Euclidean distance matrices and applications. International Series in Operations Research and Management Science, 166, 879–914. https://doi.org/10.1007/978-1-4614-0769-0_30
- Lei Ba, J., Kiros, J. R., & Hinton, G. E. (2016). Layer Normalization. 14. https://doi.org/10.48550/arXiv.1607.06450
- Makridakis, S. (2017). The forthcoming Artificial Intelligence (AI) revolution: Its impact on society and firms. Futures, 90, 46–60. https://doi.org/10.1016/j.futures.2017.03.006
- Mao, A., Mohri, M., & Zhong, Y. (2023). Cross-Entropy Loss Functions: Theoretical Analysis and Applications. 40th International Conference on Machine Learning, 23803–23828.
- Momeni, M., Mohseni, M., & Soofi, M. (2015). Clustering Stock Market Companies via K-Means Algorithm. Kuwait Chapter of Arabian Journal of Business and Management Review, 4(5), 1–10. https://doi.org/10.12816/0018959
- Noels, S., De Ridder, S., Viaene, S., & De Bie, T. (2023). An efficient graph-based peer selection method for financial statements. Intelligent Systems in Accounting, Finance and Management, 30(3), 120–136. https://doi.org/10.1002/isaf.1539
- Pascanu, R., Mikolov, T., & Bengio, Y. (2013). On the difficulty of training recurrent neural networks. 30th International Conference on Machine Learning, ICML 2013, PART 3, 2347–2355.
- Phillips, R. L., & Ormsby, R. (2016). Industry classification schemes: An analysis and review. Journal of Business and Finance Librarianship, 21(1), 1–25. https://doi.org/10.1080/08963568.2015.1110229
- Pejić Bach, M., Krstić, Ž., Seljan, S., & Turulja, L. (2020). Text mining for big data analysis in financial sector: A literature review. Sustainability, 12(3), 1155. https://doi.org/10.3390/su12031155
- Puvvala, C. (2019). Company classification. Retrieved from: https://www.kaggle.com/datasets/charanpuvvala/company-classification/data
- Qu, C., Yang, L., Qiu, M., Bruce Croft, W., Zhang, Y., & Iyyer, M. (2019). BERT with history answer embedding for conversational question answering. SIGIR 2019 - Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 1133–1136. https://doi.org/10.1145/3331184.3331341
- Rahutomo, F., Kitasuka, T., & Aritsugi, M. (2012). Semantic Cosine Similarity. The 7th International Student Conference on Advanced Science and Technology. Retrieved from https://www.researchgate.net/profile/Faisal-Rahutomo/publication/262525676_Semantic_Cosine_Similarity/links/0a85e537ee3b675c1e000000/Semantic-Cosine-Similarity.pdf
- Rojas, R. (1996). The Backpropagation Algorithm. Neural Networks, 149–182. https://doi.org/10.1007/978-3-642-61068-4_7
- Ruby, U. A., Theerthagiri, P., Jacob, J. I., & Vamsidhar, Y. (2020). Binary cross entropy with deep learning technique for Image classification. International Journal of Advanced Trends in Computer Science and Engineering, 9(4), 5393–5397. https://doi.org/10.30534/ijatcse/2020/175942020
- Santini, S., & Jain, R. (1999). Similarity measures. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(9), 871–883. https://doi.org/10.1109/34.790428
- Shala Riza, L., Abazi Bexheti, L., & Zoroja, J. (2025). Early identification of at-risk students in online education: A deep learning approach to predictive modelling. Business Systems Research, 16(2), 69–91. https://doi.org/10.2478/bsrj-2025-001
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2023). Attention Is All You Need. 15. https://doi.org/10.48550/arXiv.1706.03762
- Wright, S. A., & Schultz, A. E. (2018). The rising tide of artificial intelligence and business automation: Developing an ethical framework. Business Horizons, 61(6), 823–832. https://doi.org/10.1016/j.bushor.2018.07.001
- Wu, Y., Schuster, M., Chen, Z., Le, Q. V., Norouzi, M., Macherey, W., Krikun, M., Cao, Y., Gao, Q., Macherey, K., Klingner, J., Shah, A., Johnson, M., Liu, X., Kaiser, Ł., Gouws, S., Kato, Y., Kudo, T., Kazawa, H., … Dean, J. (2016). Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. 23. https://doi.org/10.48550/arXiv.1609.08144
- Zhuang, Z., Liu, M., Cutkosky, A., & Orabona, F. (2022). Understanding AdamW through Proximal Methods and Scale-Freeness. 24. https://doi.org/10.48550/arXiv.2202.00089
Language: English
Page range: 204–222
Submitted on: Oct 21, 2024
Accepted on: Aug 15, 2025
Published on: May 10, 2026
In partnership with: Paradigm Publishing Services
Publication frequency: 2 issues per year
© 2026 Timotej Jagrič, Aljaž Herman, published by IRENET - Society for Advancing Innovation and Research in Economy
This work is licensed under the Creative Commons Attribution 4.0 License.