OpenAI Publishes Research Report: Revealing the Root Causes of the 'Hallucination' Issue in Large Language Models
Author: Editorial Staff

OpenAI has recently published a research report examining the persistent 'hallucination' problem in language models. The report notes that, despite steady improvements in model capability, hallucination, in which a model generates incorrect answers with high confidence, remains difficult to resolve.

OpenAI's latest study traces the problem to the way mainstream training and evaluation frameworks score models: they reward 'guessing' rather than encouraging a model to acknowledge the limits of its knowledge when it is uncertain. For example, on the SimpleQA evaluation, OpenAI's older o4-mini model achieved slightly higher accuracy, but its error rate (i.e., its hallucination rate) reached 75%, far above the 26% of the newer gpt-5-thinking-mini model. This suggests that pushing models to guess under uncertainty can raise accuracy while substantially increasing the rate of hallucinations.
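The incentive at work here can be illustrated with a toy calculation (a sketch with made-up numbers, not OpenAI's actual benchmark data): under standard binary accuracy, a model that always guesses can outscore a cautious one, even while producing far more confident errors.

```python
def binary_score(answers):
    """Standard accuracy: 1 point for correct, 0 for wrong or abstaining."""
    return sum(1 for a in answers if a == "correct") / len(answers)

# Hypothetical 100-question run: both models truly know 20-25 answers.
# The guesser answers every remaining question anyway and gets a few
# right by luck; the cautious model abstains whenever it is unsure.
guesser  = ["correct"] * 25 + ["wrong"] * 75
cautious = ["correct"] * 20 + ["abstain"] * 80

print(binary_score(guesser))   # 0.25 -> higher accuracy...
print(binary_score(cautious))  # 0.20

# ...but the guesser's confident-error (hallucination) rate is 75%,
# while the cautious model's is 0%.
print(sum(a == "wrong" for a in guesser) / 100)   # 0.75
print(sum(a == "wrong" for a in cautious) / 100)  # 0.0
```

Because abstaining and being wrong both score zero, guessing strictly dominates under this metric, which is exactly the incentive the report criticizes.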

The research further traces the deeper origin of hallucinations to the model's pre-training methodology, in which the model learns by predicting the next word across a vast corpus of text. Because this process provides no 'true/false' feedback, the model struggles to answer questions about rare, arbitrary facts; it can only 'fabricate' a plausible response from statistical patterns, which gives rise to hallucinations.

OpenAI proposes that the key to addressing this issue is to overhaul the existing evaluation system: impose stricter penalties on confidently incorrect answers while rewarding models for expressing uncertainty.
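A minimal sketch of such a scoring rule (the penalty values here are hypothetical, not a scheme OpenAI specifies): wrong answers lose points while 'I don't know' scores zero, so a cautious model can outscore one that guesses on everything.

```python
def penalized_score(answers, wrong_penalty=2.0):
    """Confidence-aware grading: correct = +1, confidently wrong = -penalty,
    abstaining = 0. Penalty value is illustrative."""
    score = 0.0
    for a in answers:
        if a == "correct":
            score += 1.0
        elif a == "wrong":
            score -= wrong_penalty
        # "abstain" contributes 0: admitting uncertainty is never punished
    return score / len(answers)

# Hypothetical 100-question run: a guesser with 25 right / 75 wrong,
# versus a cautious model with 20 right / 80 abstentions.
guesser  = ["correct"] * 25 + ["wrong"] * 75
cautious = ["correct"] * 20 + ["abstain"] * 80

print(penalized_score(guesser))   # -1.25: guessing is now a losing strategy
print(penalized_score(cautious))  #  0.20
```

Under this grading, the ranking of the two models flips relative to plain accuracy, so the training and evaluation incentive shifts from guessing toward calibrated abstention.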

  • C114 Communication Network