References
- Abdullah, M. H. A., Aziz, N., Abdulkadir, S. J., Akhir, E. A. P., & Talpur, N. (2022). Event detection and information extraction strategies from text: A preliminary study using GENIA corpus. In International Conference on Emerging Technologies and Intelligent Systems(pp. 118-127). Cham: Springer International Publishing.
- Abdullah, M. H. A., Aziz, N., Abdulkadir, S. J., Alhussian, H. S. A., & Talpur, N. (2023). Systematic literature review of information extraction from textual data: Recent methods, applications, trends, and challenges. IEEE Access, 11, 10535-10562. https://doi.org/10.1109/ACCESS.2023.3240898
- Adnan, K., & Akbar, R. (2019a). An analytical study of information extraction from unstructured and multidimensional big data. Journal of Big Data, 6(1), 91. https://doi.org/10.1186/s40537-019-0254-8
- Adnan, K., & Akbar, R. (2019b). An analytical study of information extraction from unstructured and multidimensional big data. Journal of Big Data, 6(1), 91. https://doi.org/10.1186/s40537-019-0254-8
- Adnan, K., Akbar, R., Khor, S. W., & Ali, A. B. A. (2019). Role and challenges of unstructured big data in healthcare. In N. Sharma, A. Chakrabarti, & V. E. Balas (Eds.), Data management, analytics and innovation: Proceedings of ICDMAI 2019 (Vol. 1, pp. 301-323). Springer
- Akkurt, F., Gungor, O., Marşan, B., Gungor, T., Ozturk Basaran, B., Özgür, A., & Uskudarli, S. (2024). Evaluating the quality of a corpus annotation scheme using pretrained language models. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) (pp. 6504-6514). Torino, Italia.
- Akmal, M., & Romadhony, A. (2020). Corpus development for Indonesian product named entity recognition using semi-supervised approach. In 2020 international conference on data science and its applications (ICoDSA) (pp. 1-5). IEEE.
- Alkaissi, H., & McFarlane, S. I. (2023). Artificial hallucinations in ChatGPT: Implications in scientific writing. Cureus, 15(2), e35179. https://doi.org/10.7759/cureus.35179
- Bossy, R., Jourde, J., Manine, A.-P., Veber, P., Alphonse, E., van de Guchte, M., Bessières, P., & Nédellec, C. (2012). BioNLP Shared Task - The Bacteria Track. BMC Bioinformatics, 13(Suppl 11), S3. https://doi.org/10.1186/1471-2105-13-S11-S3
- Buchholz, S., & Marsi, E. (2006). CoNLL-X Shared Task on multilingual dependency parsing. In Proceedings of the tenth conference on computational natural language learning (CoNLL-X) (pp. 149-164).
- Cohen, K. B., Lanfranchi, A., Choi, M. J., Bada, M., Baumgartner, W. A., Jr., Panteleyeva, N., Verspoor, K., Palmer, M., & Hunter, L. E. (2017). Coreference annotation and resolution in the Colorado Richly Annotated Full Text (CRAFT) corpus of biomedical journal articles. BMC Bioinformatics, 18(1), 372. https://doi.org/10.1186/s12859-017-1775-9
- Csanády, B., Muzsai, L., Vedres, P., Nádasdy, Z., & Lukács, A. (2024). LlamBERT: Large-scale low-cost data annotation in NLP. arXiv. https://doi.org/10.48550/arXiv.2403.15938
- Deléger, L., Bossy, R., Chaix, E., Ba, M., Ferré, A., Bessières, P., & Nédellec, C. (2016). Overview of the Bacteria Biotope Task at BioNLP Shared Task 2016. In Proceedings of the 4th BioNLP Shared Task Workshop (pp. 12-22). Berlin, Germany.
- Frei, J., & Kramer, F. (2023). Annotated dataset creation through large language models for non-English medical NLP. Journal of Biomedical Informatics, 145, 104478. https://doi.org/10.1016/j.jbi.2023.104478
- Gao, C. A., Howard, F. M., Markov, N. S., Dyer, E. C., Ramesh, S., Luo, Y., & Pearson, A. T. (2023). Comparing scientific abstracts generated by ChatGPT to real abstracts with detectors and blinded human reviewers. NPJ Digital Medicine, 6, Article 75. https://doi.org/10.1038/s41746-023-00774-5
- Gao, J., Zhao, H., Yu, C., & Xu, R. (2023). Exploring the feasibility of ChatGPT for event extraction. arXiv. https://doi.org/10.48550/arXiv.2303.03836 Retrieved March 01, 2023, from https://ui.adsabs.harvard.edu/abs/2023arXiv230303836G
- Grynbaum, M. M., & Mac, R. (2023). The Times sues OpenAI and Microsoft over A.I. use of copyrighted work. The New York Times. Retrieved 15 April 2024 from https://www.nytimes.com/2023/12/27/business/media/new-york-times-open-ai-microsoft-lawsuit.html
- Hadi, M. U., Al-Tashi, Q., Qureshi, R., Shah, A., Muneer, A., Irfan, M., Zafar, A., Shaikh, M., Akhtar, N., Wu, J., & Mirjalili, S. (2023). Large language models: A comprehensive survey of its applications, challenges, limitations, and future prospects. https://doi.org/10.36227techrxiv.23589741.v4
- Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E., Bang, Y. J., Madotto, A., & Fung, P. (2023). Survey of hallucination in natural language generation. ACM Computing Surveys, 55(12), Article 248. https://doi.org/10.1145/3571730
- Jurafsky, D., Chai, J., Schluter, N., & Tetreault, J. (2020). Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Online. https://aclanthology.org/2020.acl-main.0.pdf
- Kim, J.-D., Ohta, T., & Tsujii, J. (2008). Corpus annotation for mining biomedical events from literature. BMC Bioinformatics, 9(1), 10. https://doi.org/10.1186/1471-2105-9-10
- Kim, J.-D., Wang, Y., Takagi, T., & Yonezawa, A. (2011). Overview of GENIA event task in BioNLP shared task 2011. In Proceedings of the BioNLP Shared Task 2011 Workshop, (pp. 7-15). Portland, Oregon, USA.
- Kim, J.-D., Ohta, T., Tateisi, Y., & Tsujii, J. (2003). GENIAcorpus-semantically annotated corpus for bio-textmining. Bioinformatics, 19(Suppl 1), i180-182. https://doi.org/10.1093/bioinformatics/btg1023
- Lever, J., Altman, R., & Kim, J.-D. (2020). Extending TextAE for annotation of non-contiguous entities. Genomics Inform, 18(2), e15. https://doi.org/10.5808/GI.2020.18.2.e15
- Li, G., Wang, P., Xie, J., Cui, R., & Deng, Z. (2022). FEED: A Chinese financial event extraction dataset constructed by distant supervision, In Proceedings of the 10th International Joint Conference on Knowledge Graphs, Virtual Event, Thailand. https://doi.org/10.1145/3502223.3502229
- Li, M., Shi, T., Ziems, C., Kan, M.-Y., Chen, N. F., Liu, Z., & Yang, D. (2023). Coannotating: Uncertainty-guided work allocation between human and large language models for data annotation. arXiv. https://doi.org/10.48550/arXiv.2310.15638
- Li, Z. (2023). The dark side of ChatGPT: legal and ethical challenges from stochastic parrots and hallucination. arXiv. https://doi.org/10.48550/arXiv.2304.14347
- Lin, Y. (2020). Multilingual multitask joint neural information extraction (Doctoral dissertation, University of Illinois at Urbana-Champaign). https://hdl.handle.net/2142/109521
- Lin, Y., Ji, H., Huang, F., & Wu, L. (2020). A joint neural model for information extraction with global features. InProceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 7999-8009).
- Linguistic Data Consortium (2005). ACE (Automatic Content Extraction) English annotation guidelines for events. https://www.ldc.upenn.edu/sites/www.ldc.upenn.edu/files/english-events-guidelines-v5.4.3.pdf
- Liu, X., Luo, Z., & Huang, H. (2018). Jointly multiple events extraction via attention-based graph information aggregation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing Brussels, Belgium.
- McIntosh, T. R., Liu, T., Susnjak, T., Watters, P., Ng, A., & Halgamuge, M. N. (2024). A culturally sensitive test to evaluate nuanced GPT hallucination. IEEE Transactions on Artificial Intelligence, 5(6), 2739-2751. https://doi.org/10.1109/TAI.2023.3332837
- Metz, C., & Robertson, K. (2024). OpenAI Seeks to Dismiss Parts of The New York Times’s Lawsuit. The New York Times. Retrieved 15 April 2024 from https://www.nytimes.com/2024/02/27/technology/openai-new-york-times-lawsuit.html
- Mirzakhmedova, N., Gohsen, M., Chang, C. H., & Stein, B. (2024). Are large language models reliable argument quality annotators? In Conference on Advances in Robust Argumentation Machines (pp. 129-146). Cham: Springer Nature Switzerland.
- Nawaz, R., Thompson, P., McNaught, J., & Ananiadou, S. (2010). Meta-Knowledge Annotation of Bio-Events. In LREC (Vol. 17, pp. 2498-2507).
- Nédellec, C., Bossy, R., Chaix, E., & Deléger, L. (2018). Text-mining and ontologies: New approaches to knowledge discovery of microbial diversity. arXiv. https://doi.org/10.48550/arXiv.1805.04107
- Neves, M., & Leser, U. (2012). A survey on annotation tools for the biomedical literature. Briefings in Bioinformatics, 15(2), 327-340. https://doi.org/10.1093/bib/bbs084
- Neves, M., & Ševa, J. (2019). An extensive review of tools for manual annotation of documents. Briefings in Bioinformatics, 22(1), 146-163. https://doi.org/10.1093/bib/bbz130
- O’Donnell, M. (2008). The UAM CorpusTool: Software for corpus annotation and exploration. In Proceedings of the XXVI Congreso de AESLA (Vol. 3, p. 5). Spain: Almeria
- Ohta, T., Kim, J.-D., & Tsujii, J. (2007). Guidelines for event annotation. Department of Information Science, Graduate School of Science, University of Tokyo
- Ohta, T., Pyysalo, S., Rak, R., Rowley, A., Chun, H.-W., Jung, S.-J., Choi, S.-P., Ananiadou, S., & Tsujii, J. (2013). Overview of the Pathway Curation (PC) task of BioNLP Shared Task 2013. In Proceedings of the BioNLP Shared Task 2013 Workshop Sofia, Bulgaria.
- Ohta, T., Pyysalo, S., & Tsujii, J. (2011). Overview of the epigenetics and post-translational modifications (EPI) task of BioNLP shared task 2011. In Proceedings of BioNLP Shared Task 2011 Workshop (pp. 16-25).
- Papazian, F., Bossy, R., & Nédellec, C. (2012). AlvisAE: A collaborative web text annotation editor for knowledge acquisition. In Proceedings of the Sixth Linguistic Annotation Workshop (pp. 149-152).
- Pestian, J., Brew, C., Matykiewicz, P., Hovermale, D. J., Johnson, N., Cohen, K. B., & Duch, W. (2007). A shared task involving multi-label classification of clinical free text. In Biological, translational, and clinical language processing (pp. 97-104).
- Pyysalo, S., Ohta, T., & Ananiadou, S. (2013). Overview of the Cancer Genetics (CG) task of BioNLP Shared Task 2013. In Proceedings of the BioNLP Shared Task 2013 Workshop Sofia, Bulgaria.
- Pyysalo, S., Ohta, T., Miwa, M., Cho, H.-C., Tsujii, J., & Ananiadou, S. (2012). Event extraction across multiple levels of biological organization. Bioinformatics, 28(18), i575-i581. https://doi.org/10.1093/bioinformatics/bts407
- Pyysalo, S., Ohta, T., Rak, R., Sullivan, D., Mao, C., Wang, C., Sobral, B., Tsujii, J., & Ananiadou, S. (2011a). Annotation guidelines for infectious diseases event corpus. In Tech rep, Tsujii Laboratory, University of Tokyo.
- Pyysalo, S., Ohta, T., Rak, R., Sullivan, D., Mao, C., Wang, C., Sobral, B., Tsujii, J., & Ananiadou, S. (2011b). Overview of the Infectious Diseases (ID) task of BioNLP Shared Task 2011. In J. Tsujii, J.-D. Kim, & S. Pyysalo, Proceedings of BioNLP Shared Task 2011 Workshop Portland, Oregon, USA.
- Pyysalo, S., Ohta, T., Rak, R., Sullivan, D., Mao, C., Wang, C., Sobral, B., Tsujii, J., & Ananiadou, S. (2012). Overview of the ID, EPI and REL tasks of BioNLP shared task 2011. BMC Bioinformatics, 13(Suppl 11), S2. https://doi.org/10.1186/1471-2105-13-S11-S2
- Stenetorp, P., Pyysalo, S., Topić, G., Ohta, T., Ananiadou, S., & Tsujii, J. (2012). BRAT: A web-based tool for NLP-assisted text annotation. In Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics (pp. 102-107).
- Stenetorp, P., Topić, G., Pyysalo, S., Ohta, T., Kim, J.-D., & Tsujii, J. (2011). BioNLPShared Task 2011: Supporting resources. In Proceedings of Bionlp Shared Task 2011 Workshop Portland, Oregon, USA.
- Talpur, N., Abdulkadir, S. J., Alhussian, H., Hasan, M. H., Aziz, N., & Bamhdi, A. (2022). A comprehensive review of deep neuro-fuzzy system architectures and their optimization methods. Neural Computing and Applications, 34(3), 1837-1875. https://doi.org/10.1007/s00521-021-06807-9
- Talpur, N., Abdulkadir, S. J., Akhir, E. A. P. A., Hasan, M. H., Alhussian, H., & Abdullah, M. H. A. (2023). A novel bitwise arithmetic optimization algorithm for the rule base optimization of deep neuro-fuzzy system. Journal of King Saud University-Computer and Information Sciences, 35(2), 821-842. https://doi.org/10.1016/j.jksuci.2023.01.020
- Tan, Z., Beigi, A., Wang, S., Guo, R., Bhattacharjee, A., Jiang, B., Karami, M., Li, J., Cheng, L., & Liu, H. (2024). Large language models for data annotation: A survey. arXiv. https://doi.org/10.48550/arXiv.2402.13446
- Törnberg, P. (2024). Best practices for text annotation with large language models. arXiv. https://doi.org/10.48550/arXiv.2402.05129
- Vauth, M., Hatzel, H. O., Gius, E., & Biemann, C. (2021). Automated event annotation in literary texts. In Proceedings of the Conference on Computational Humanities Research, CHR2021, (pp.333-345). Amsterdam, The Netherlands.
- Walker, C., Strassel, S., Medero, J., & Maeda, K. (2006). ACE 2005 multilingual training corpus (LDC2006T06) [Data set]. Linguistic Data Consortium. https://doi.org/10.35111/mwxc-vh88
- Wang, X., Wang, Z., Han, X., Jiang, W., Han, R., Liu, Z., Li, J., Li, P., Lin, Y., & Zhou, J. (2020). MAVEN: A massive general domain event detection dataset. arXiv. https://doi.org/10.48550/arXiv.2004.13590
- Wang, Y., Mishra, S., Alipoormolabashi, P., Kordi, Y., Mirzaei, A., Arunkumar, A., Ashok, A., Dhanasekaran, A. S., Naik, A., Stap, D., Pathak, E., Karamanolakis, G., Lai, H. G., Purohit, I., Mondal, I., Anderson, J., Kuznia, K., Doshi, K., Patel, M., … Khashabi, D. (2022). Super-NaturalInstructions: Generalization via declarative instructions on 1600+ NLP tasks. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing Abu Dhabi, United Arab Emirates. https://arxiv.org/abs/2204.07705
- Wu, H., Lei, Q., Zhang, X., & Luo, Z. (2020). Creating a large-scale financial news corpus for relation extraction. In 2020 3rd International Conference on Artificial Intelligence and Big Data (ICAIBD) (pp. 259-263). IEEE. https://doi.org/10.1109/ICAIBD49809.2020.913744
- Xi, X., Lv, J., Liu, S., Ye, W., Yang, F., & Wan, G. (2022). MUSIED: A benchmark for event detection from multi-source heterogeneous Informal Texts. arXiv. https://doi.org/10.48550/arXiv.2211.13896
- Xu, R., Liu, T., Li, L., & Chang, B. (2021). Document-level event extraction via heterogeneous graph-based interaction model with a tracker. In C. Zong, F. Xia, W. Li, & R. Navigli (EDs), Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) Online
- Yang, H., Chen, Y., Liu, K., Xiao, Y., & Zhao, J. (2018). DCFEE: A document-level Chinese financial event extraction system based on automatically labeled training data. In Proceedings of ACL 2018, System Demonstrations (pp. 50-55).
- Yao, F., Xiao, C., Wang, X., Liu, Z., Hou, L., Tu, C., Li, J., Liu, Y., Shen, W., & Sun, M. (2022). LEVEN: A large-scale Chinese legal event detection dataset. arXiv. https://doi.org/10.48550/arXiv.2203.08556
- Zaman, G., Mahdin, H., Hussain, K., & Rahman, A. (2020). Information extraction from semi-and unstructured data sources: A systematic literature review. ICIC Express Letters, 14(6), 593-603.
- Zheng, S., Cao, W., Xu, W., & Bian, J. (2019). Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction. In K. Inui, J. Jiang, V. Ng, & X. Wan (Eds.), Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 337-346). Hong Kong, China. https://doi.org/10.18653/v1/D19-1032