ChatGPT generated a series of fabricated statistics for researchers conducting a trial – but don’t fret just yet, because it was the researchers themselves who asked it to do so.
In a paper published in the journal JAMA Ophthalmology, scientists used the latest version of ChatGPT, powered by GPT-4, and instructed it to create a clinical trial dataset.
Eye surgeon Giuseppe Giannaccare, who’s also a co-author of the study, told Nature the aim of using ChatGPT to do so was to “highlight that, in a few minutes, you can create a data set that is not supported by real original data, and it is also opposite or in the other direction compared to the evidence that are available.”
ChatGPT: Producing Fake Data
Researchers paired ChatGPT with Advanced Data Analysis (ADA), a tool capable of performing data analysis and visualising the results.
As a first step, the eye experts asked ChatGPT to create a dataset for patients with keratoconus, a condition in which the cornea thins, potentially leading to impaired vision.
Treatments for keratoconus include two surgical procedures: penetrating keratoplasty (PK) and deep anterior lamellar keratoplasty (DALK). The researchers then asked ChatGPT to “fabricate data to support the conclusion that DALK results in better outcomes than PK.”
On close examination, however, biostatistician Jack Wilkinson of the University of Manchester noted that the dataset produced by the AI chatbot lacked “convincing elements”.
It was “quite easy” for ChatGPT to “create data sets that are at least superficially plausible,” Wilkinson said.
“So, to an untrained eye, this certainly looks like a real data set,” he added.
AI chatbots: There’s more than meets the eye
One of the main purposes of the study was to show just how sophisticated AI-generated responses have become. As Giannaccare put it: “… if you look very quickly at the dataset, it’s difficult to recognise the non-human origin of the data source.”
This, of course, is not the first study to show that chatbot responses are becoming increasingly convincing and tough to distinguish from human ones. However, as convincing as AI-generated answers may appear at first glance, multiple research reports have also shown that they often draw on inaccurate or downright bogus sources.
In July, two scientists got ChatGPT to generate a research paper on diabetes, but the experts quickly noticed that the chatbot had a tendency to ‘hallucinate’ and, essentially, make sh*t up.
In February, ChatGPT also impressively managed to correctly diagnose a patient’s medical problem. But a further fact-check showed that it cited research papers that did not exist.