Unstructured documents processing through their anonymization

By: Peter Kvasnica  
Open Access | Jun 2025

References

  1. Ministry of Justice of Slovak Republic, (2013). Courts Act No. 757/2004 Coll. https://obcan.justice.sk/infosud/zoznam/rozhodnutie
  2. Ministry of Justice of Slovak Republic, (2015). Decree on the publication of court decisions No. 482/2011 Coll. https://www.slov-lex.sk/pravne-predpisy/SK/ZZ/2011/482/20120101
  3. Ministry of Justice of Slovak Republic, (2010). Act on Free Access to Information No. 211/2000 Coll. https://www.slovlex.sk/pravne-predpisy/SK/ZZ/2000/211/20160701
  4. P. Kvasnica, “Creation of datasets for machine learning in the anonymisation of unstructured documents”. In: Proceedings of the 22nd ISC’2024 Industrial Simulation Conference, pp. 13–18, ISBN 978-9-492589-30-3, 2024.
  5. E. M. Nrl, et al., “MUC-7 Evaluation of IE Technology”, Overview of Results MUC-7 Program Committee. In Program, 1998.
  6. H. H. Hock, B. D. Joseph, Language History, Language Change, and Language Relationship: An Introduction to Historical and Comparative Linguistics [online]. [s.l.]: De Gruyter, ISBN 9783110214307, 2009.
  7. D. Hládek, et al., “The Slovak morphological classifier”. ELMAR, 2012 Proceedings, pp. 12–14, September, 2012. Available online: <http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6338504>, <http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6338504>.
  8. O. Kaššák, Extrakcia pomenovaných entít pre slovenský jazyk. Znalosti 2012. Sborník příspěvků 11. ročníku konference. Praha: Matfyzpress, pp. 52–61, 2012.
  9. S. Chakrabarti, Mining the Web: Discovering Knowledge from Hypertext Data. Morgan Kaufmann Publishers, ISBN-13: 9781558607545, 2002.
  10. K. Han, Y. Song, H. Rim, Probabilistic Model for Definitional Question Answering. Korea University, Seoul, Korea, 2005.
  11. G. Salton, C. Buckley, “Term weighting approaches in automatic text retrieval”. Information Processing and Management, 24(5), pp. 513–523, 1988.
  12. S. Dumais, Enhancing performance in LSI retrieval. Technical Report 91/09/17, Bellcore, 1991.
  13. T. M. Mitchell, Machine Learning. The McGraw-Hill Companies, Inc., New York, USA, 414 pp., 1997.
  14. I. H. Witten, E. Frank, Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann Publishers, 2005.
  15. P. Stenetorp, et al., “A web-based tool for NLP-assisted text annotation”. Proc. of the Demonstrations at the 13th Conf. of the European Chapter of the Association for Computational Linguistics. ACL, pp. 102–107, 2012.
  16. A. Bagga, B. Baldwin, “Algorithms for scoring coreference chains”. The First International Conference on Language Resources and Evaluation Workshop on Linguistics Coreference, pp. 563–566, 1998.
  17. C. D. Manning, et al., “The Stanford CoreNLP Natural Language Processing Toolkit”. Proc. of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, ACL, pp. 55–60, 2014. Available online: http://aclweb.org/anthology/P14-5010.
  18. P. Bednár, P. Butka, J. Paralic, Java Library for Support of Text Mining and Retrieval. ZNALOSTI 2005, Stará Lesná, Vyd. Univerzity Palackého Olomouc, pp. 162-169, ISBN 80-248-0755-6, 2005.
  19. Ministry of Justice of Slovak Republic, (2010). Information on the components of anonymization services, accessible from the justice.sk domain. http://intranet/intranet/sudy
  20. Spring by VMware Tanzu (2021). How to build effective agents, Retrieval Augmented Generation. https://docs.spring.io/spring-ai/reference/api/retrieval-augmented-generation.html
DOI: https://doi.org/10.2478/jee-2025-0028 | Journal eISSN: 1339-309X | Journal ISSN: 1335-3632
Language: English
Page range: 275 - 283
Submitted on: Apr 4, 2025
Published on: Jun 19, 2025
In partnership with: Paradigm Publishing Services
Publication frequency: 6 issues per year

© 2025 Peter Kvasnica, published by Slovak University of Technology in Bratislava
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.