TU Darmstadt

Andrea Gillhuber

Study shows limitations of LLMs

A new study led by TU Darmstadt has revealed the limitations of AI models such as ChatGPT. The researchers conclude that it is a fallacy to assume LLMs can perform complex tasks correctly without human support.

A new study led by TU Darmstadt has revealed the limitations of AI models such as ChatGPT. The study, which will be presented at the annual conference of the Association for Computational Linguistics (ACL) in Bangkok in August, concludes that these models are less capable of learning independently than previously assumed. There is no evidence that large language models (LLMs) develop a general "intelligent" behavior that enables complex thinking or planned action.

The study focuses on so-called "emergent capabilities" - unexpected leaps in the performance of language models that were observed as the models were scaled up. Thanks to larger amounts of data and more complex structures, these models can handle more and more language-based tasks, such as recognizing fake news or drawing logical conclusions. According to the researchers, however, there is no evidence that they develop sophisticated thinking abilities.

The scientists, including TU Darmstadt Professor Iryna Gurevych and Dr. Harish Tayyar Madabushi from the University of Bath, found that the models only acquired the ability to perform relatively simple [...]

"However, our results do not mean that AI poses no threat at all," Gurevych emphasized. "Rather, we show that the alleged emergence of complex reasoning abilities associated with certain threats is not supported by evidence and that we can control the learning process of LLMs well after all. Therefore, future research should focus on other risks posed by the models, such as their potential to be used to generate fake news."

For users of AI systems such as ChatGPT, this means that these models should not be relied upon to perform complex tasks correctly without human assistance. Users are advised to give clear instructions and provide examples. According to the study, the tendency of models to produce plausible-sounding but incorrect results - known as confabulation - persists, even though the quality of the models has improved considerably in recent times.

