Detecting LLM-assisted writing in scientific communication: Are we there yet?

Open Access | Jul 2024

Figures & Tables

Figure 1.

A schematic view of the LLM-Assisted Writing (LAW) detector. The detection process consists of two phases. First, during training, an author's manuscripts are converted into vectors representing the author's writing style using the technique of Lazebnik & Rosenfeld (2023). The average change in writing style and its standard deviation are measured to capture the dynamics of the author's style. Then, during inference, each manuscript is examined to determine whether the change in its author's writing style is substantial enough to be considered an anomaly, and whether this anomaly is aligned with the style of an LLM-generated manuscript with the same title and abstract. If both conditions are met, the manuscript is deemed LLM-assisted.
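The two phases described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the caption does not specify how style change, the anomaly threshold, or alignment are computed, so the sketch assumes Euclidean distance between consecutive style vectors, a threshold of mean plus `k` standard deviations, and cosine similarity for alignment. The function names and the parameters `k` and `min_cos` are hypothetical, and the style vectors are assumed to be given.

```python
import numpy as np


def fit_style_dynamics(style_vectors):
    """Training phase (assumed form): mean and standard deviation of the
    change between consecutive manuscripts of one author."""
    vecs = np.asarray(style_vectors, dtype=float)
    if vecs.ndim != 2 or vecs.shape[0] < 2:
        raise ValueError("need at least two style vectors")
    # Euclidean size of each step between consecutive manuscripts (assumption).
    steps = np.linalg.norm(np.diff(vecs, axis=0), axis=1)
    return float(steps.mean()), float(steps.std())


def is_llm_assisted(prev_vec, new_vec, llm_vec, mean_step, std_step,
                    k=2.0, min_cos=0.5):
    """Inference phase (assumed form): flag a manuscript only if its style
    change is anomalous AND points toward the LLM-generated style."""
    prev_vec = np.asarray(prev_vec, dtype=float)
    new_vec = np.asarray(new_vec, dtype=float)
    llm_vec = np.asarray(llm_vec, dtype=float)

    change = new_vec - prev_vec
    step = np.linalg.norm(change)
    # Condition 1: the change is unusually large for this author.
    anomalous = step > mean_step + k * std_step

    # Condition 2: the change is aligned with the direction of the
    # LLM-generated manuscript's style (cosine similarity, an assumption).
    toward_llm = llm_vec - prev_vec
    denom = step * np.linalg.norm(toward_llm)
    cosine = float(change @ toward_llm / denom) if denom > 0 else 0.0

    return bool(anomalous and cosine >= min_cos)


# Toy usage with made-up two-dimensional style vectors.
history = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [0.3, 0.1]]
mean_step, std_step = fit_style_dynamics(history)
print(is_llm_assisted(history[-1], [2.3, 2.1], [3.0, 3.0], mean_step, std_step))
```

Requiring both conditions means that a large style shift alone, for example after a change of field or co-authors, is not flagged unless it also moves toward the LLM-generated style.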

Pairwise Cohen's κ values calculated for the five detectors. Each cell contains the result for the assessment set on the left and the result for the false-positive set on the right.

| | DetectLLM | ZipPy | ConDA | LAW |
|---|---|---|---|---|
| LLMDet | 0.86 / 0.82 | 0.68 / 0.74 | 0.67 / 0.72 | 0.63 / 0.69 |
| DetectLLM | | 0.72 / 0.76 | 0.67 / 0.75 | 0.59 / 0.62 |
| ZipPy | | | 0.86 / 0.96 | 0.77 / 0.88 |
| ConDA | | | | 0.81 / 0.90 |
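For readers unfamiliar with the agreement measure in the table above, the following is a minimal sketch of Cohen's κ for two detectors' label lists. The function name `cohen_kappa` and the toy labels are illustrative assumptions, not the paper's data or code.

```python
def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    labels_a, labels_b = list(labels_a), list(labels_b)
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items labeled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if both raters labeled independently at their
    # own marginal rates.
    categories = set(labels_a) | set(labels_b)
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )

    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # and returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy usage: 1 = flagged as LLM-assisted, 0 = not flagged (made-up labels).
detector_a = [1, 1, 0, 0]
detector_b = [1, 0, 0, 0]
print(cohen_kappa(detector_a, detector_b))
```

A value of 1 indicates perfect agreement, while 0 indicates agreement no better than chance.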

List of manuscripts included in the assessment set.

| LLM-assisted Writing | Counterpart |
|---|---|
| Osterrieder, J., GPTChat, A Primer on Deep Reinforcement Learning for Finance, SSRN (2023) | Finance, F., Osterrieder, J., Generative Adversarial Networks in finance: an overview, arXiv (2021) |
| Biswas, S., Will ChatGPT take my Job? Replies and Advice by ChatGPT, SSRN (2023) | Biswas, S., Role of Sonography in Ocular Trauma: A Study, ARC Journal of Surgery (2021) |
| Askr, H., Darwish, A., Hassanien, A.E., ChatGPT, The Future of Metaverse in the Virtual Era and Physical World: Analysis and Applications. Studies in Big Data (2023) | Gad, I., Hassanien, A. E., A wind turbine fault identification using machine learning approach based on pigeon inspired optimizer, Tenth International Conference on Intelligent Computing and Information Systems (2021) |
| King, M. R., chatGPT, A Conversation on Artificial Intelligence, Chatbots, and Plagiarism in Higher Education, Cellular and Molecular Bioengineering (2023) | King, M. R., CMBE Moves to the Structured Abstract Format: A Note from the Editor, Cellular and Molecular Bioengineering (2017) |
| Kung et al., Performance of ChatGPT on USMLE: Potential for AI-Assisted Medical Education Using Large Language Models, medRxiv (2022) | Kung, H. K., Host physician perspectives to improve predeparture training for global health electives, Medical Education (2017) |
| O’Connor S., Open artificial intelligence platforms in nursing education: Tools for academic progress or abuse?, Nurse Education in Practice (2022) | O’Connor S., Exoskeletons in Nursing and Healthcare: A Bionic Future, Clinical Nursing Research (2021) |
| Rossoni, L., A inteligência artificial e eu: escrevendo o editorial juntamente com o ChatGPT, Revista Eletrônica de Ciência Administrativa (2022) | Rossoni, L., Editorial: A RECADM no Redalyc e o Dilema das Bases e Indexadores, Revista Eletrônica de Ciência Administrativa (2021) |
| chatGPT, Zhavoronkov, A., Rapamycin in the context of Pascal’s Wager: generative pre-trained transformer perspective, Oncoscience (2022) | Zhavoronkov, A., The inherent challenges of classifying senescence, Science (2020) |
| Biswas, S., ChatGPT and the Future of Medical Writing, Radiology (2023) | Biswas, S., Biswas, S., A Study on penile doppler, MedCrave Online Journal of Surgery (2017) |
| Lazebnik, T., ChatGPT, The Impact of Fruit and Vegetable Consumption and Physical Activity on Diabetes Risk among Adults, arXiv (2022) | Lazebnik, T., Bunimovich-Mendrazitsky, S., The Signature Features of COVID-19 Pandemic in a Hybrid Mathematical Model—Implications for Optimal Work–School Lockdown Policy, Advanced Theory and Simulations (2021) |
| BaHammam, A. S., Trabelsi, K., Pandi-Perumal, S. R., Jahrami, H., Adapting to the Impact of AI in Scientific Writing: Balancing Benefits and Drawbacks while Developing Policies and Regulations, Journal of Nature and Science of Medicine (2023) | Akhtar, N., Ravi Gupta, S.R. Pandi-Perumal, Ahmed S. BaHammam: Clinical Atlas of Polysomnography: A Book Review, Sleep and Vigilance (2021) |

The performance of the examined detectors (columns) on the assessment set (accuracy, F1-score, recall, and precision) and on the false-positive set (false positive rate).

| Model | LLMDet | DetectLLM | ZipPy | ConDA | LAW |
|---|---|---|---|---|---|
| Accuracy | 0.546 | 0.591 | 0.637 | 0.637 | 0.727 |
| F1-score | 0.286 | 0.471 | 0.600 | 0.600 | 0.700 |
| Recall | 0.334 | 0.534 | 0.627 | 0.627 | 0.700 |
| Precision | 0.250 | 0.421 | 0.575 | 0.575 | 0.700 |
| False positive rate | 17.2% | 13.8% | 9.7% | 8.8% | 3.1% |
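As a quick consistency check on the table above, the F1-score is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision and recall values; the helper name `f1_score` and the `reported` dictionary are introduced here only for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) taken from the performance table.
reported = {
    "LLMDet": (0.250, 0.334, 0.286),
    "DetectLLM": (0.421, 0.534, 0.471),
    "ZipPy": (0.575, 0.627, 0.600),
    "ConDA": (0.575, 0.627, 0.600),
    "LAW": (0.700, 0.700, 0.700),
}

for name, (precision, recall, reported_f1) in reported.items():
    recomputed = round(f1_score(precision, recall), 3)
    print(f"{name}: recomputed F1 = {recomputed:.3f}, reported F1 = {reported_f1:.3f}")
```

For all five detectors the recomputed value matches the reported F1-score to three decimal places.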

Pairwise comparison between the five detectors. The results are shown as p-values with the test statistic in brackets. Each cell contains the result for the assessment set on the left and the result for the false-positive set on the right.

| | LLMDet | DetectLLM | ZipPy | ConDA |
|---|---|---|---|---|
| DetectLLM | 0.66 (0.19) / < 0.01 (10.45) | | | |
| ZipPy | 0.38 (0.78) / < 0.01 (69.63) | 0.66 (0.20) / < 0.01 (20.96) | | |
| ConDA | 0.06 (3.67) / < 0.01 (95.71) | 0.66 (0.20) / < 0.01 (34.21) | 1.0 (0.0) / 0.28 (1.13) | |
| LAW | 0.01 (0.03) / < 0.01 (729.19) | 0.15 (2.06) / < 0.01 (34.21) | 0.34 (0.92) / < 0.01 (161.74) | 0.34 (0.92) / < 0.01 (120.46) |

DOI: https://doi.org/10.2478/jdis-2024-0020 | Journal eISSN: 2543-683X | Journal ISSN: 2096-157X
Language: English
Page range: 4 - 13
Submitted on: Mar 6, 2024
Accepted on: Jun 26, 2024
Published on: Jul 9, 2024
Published by: Chinese Academy of Sciences, National Science Library
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2024 Teddy Lazebnik, Ariel Rosenfeld, published by Chinese Academy of Sciences, National Science Library
This work is licensed under the Creative Commons Attribution 4.0 License.