Artificial Intelligence (AI) has really taken off in the last few years and, as such, has driven us as veterinarians to critically evaluate where and how we would like to utilize this new technology. Last year I reported to you about a lecture I attended at ACVIM on the use of AI for writing radiology reports. It was eye-opening, to say the least! Recently (October 2023), a group of mostly veterinary neurologists took on AI in a new way. Abani et al. challenged 13 boarded neurologists from Europe and North America to distinguish between AI-generated abstracts and human-generated abstracts. The results are chilling...
Materials and Methods
There were 3 test topics provided in this study. The purpose of providing 3 was to compare "highly familiar" topics against less familiar ones, to see whether topic familiarity made a difference in the reviewers' ability to detect AI.
Topic 1: SARS-CoV-2 scent detection in dogs (considered low familiarity)
Topic 2: Biomarkers for SRMA (considered high familiarity)
Topic 3: Staining of cannabinoid receptor type 1 (considered medium familiarity)
An abstract, references, and an introduction paragraph were written by humans for each of these 3 topics. ChatGPT was then used to generate 3 additional abstracts, each with references and an introduction paragraph, on the same 3 topics. It was interesting that the authors noted ChatGPT was prompted as follows: "Write an academic abstract with a focus on (subject) in the style of (author characteristics such as position, gender and age) at (University name), for publication in (journal name)." I mean...wow. ChatGPT is able to tailor its writing to an author's gender, age, and position.
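For the curious, here is a minimal sketch of how that prompt template might be filled in and sent to a model programmatically. The placeholder values, model name, and use of the OpenAI Python client are purely illustrative assumptions on my part; the study authors simply entered the prompt into ChatGPT.

```python
# Hypothetical sketch only - not the authors' code. Assumes the openai Python
# package (v1.x) is installed and an OPENAI_API_KEY environment variable is set.
from openai import OpenAI

client = OpenAI()

# Placeholders taken from the template quoted in the paper; fill in as needed.
subject = "(subject)"  # e.g., one of the three test topics
author_characteristics = "(author characteristics such as position, gender and age)"
university = "(University name)"
journal = "(journal name)"

prompt = (
    f"Write an academic abstract with a focus on {subject} "
    f"in the style of {author_characteristics} at {university}, "
    f"for publication in {journal}."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model; the study used ChatGPT
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```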
Results
Topics 1 and 3: 4/13 (31%) correctly identified the AI-generated abstract when provided only the abstract, without the references and introduction paragraph. This increased to 9/13 (69%) when all parts were provided.
Topic 2: 7/13 (54%) correctly identified the AI-generated abstract (provided alone), which increased to 10/13 (77%) when all parts were provided.
Two separate plagiarism detectors were also evaluated in this study. All of the original published manuscripts were noted to have 58%-100% similarity to available work, indicating that the material had been published elsewhere (it had). Tests 1, 2, and 3, the AI-generated papers, had similarity indices of 0%-18%. This suggests that the plagiarism detectors could identify what had been previously published (the human-generated papers) and what had not (the AI-generated papers). The authors then evaluated all of the abstracts with an AI detector. All original manuscripts were noted to have 0% AI-generated writing. Test 2 was flagged as 100% AI-generated, while Tests 1 and 3 were flagged as having 0% content written by AI. Gulp.
Where does this leave us? My anxiety about AI-generated content was further heightened when I realized that many of my well-respected, high-achieving academic colleagues struggled to distinguish between AI-generated abstracts and human-generated abstracts in an area of our own specialty. This further reinforced my commitment to reading the entire paper, whenever possible, before considering the data valid. We were taught to do this in school, but alas, with our busy schedules, it can be missed. AI is not all bad, however. It can be quite helpful for correcting grammar, editing, summarizing references or papers, and even performing statistics. I would encourage all of us to move through the published literature with our eyes fully open and with awareness of the use of AI in modern veterinary medicine. Except yesterday...hopefully you kept your eyes partially closed and didn't look directly at the sun!!
I hope you enjoyed this little TidBit. It is a little bit off-topic, but I hope you will find it useful, nonetheless. Please know that my TidBit Tuesdays are (to date) fully human-generated, as are my patient reports! Let me know if you have any topics that you'd like me to cover. Have a great week!