Large Language Models Still Struggle to Reliably Distinguish Beliefs from Facts, Raising Red Flags for High-Risk Applications
2025-12-29
Author: Editor

A research paper in the latest issue of Nature Machine Intelligence reports a significant shortcoming of large language models (LLMs). According to the Stanford University study, LLMs have considerable difficulty recognizing users' false beliefs and therefore cannot reliably distinguish belief from fact. When a user's personal belief conflicts with objective fact, these models frequently fail to make the correct judgment.

The study evaluated 24 LLMs, including well-known models such as DeepSeek and GPT-4o, on roughly 13,000 questions. Newer models verified factual statements with an average accuracy of 91.1% to 91.5%, but they were far weaker at recognizing first-person false beliefs: they were 34.3% less likely to identify a false belief than a true one. Older models showed even larger gaps, of 38.6% for first-person false beliefs and 15.5% for third-person false beliefs.
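The study's benchmark is not reproduced here, but the distinction it probes can be illustrated with a minimal sketch: asking a model whether a statement is true (fact verification) versus asking it whether the user holds a stated belief (belief acknowledgment). The prompts, the `query_model` helper, and the example statement below are assumptions for illustration only, not the paper's actual materials or methodology.

```python
# Illustrative sketch of a fact probe vs. a first-person belief probe.
# `query_model` is a hypothetical stand-in for any chat-model API call.

def query_model(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its text reply."""
    raise NotImplementedError("Wire this up to your own model or API client.")


def probe_fact(statement: str) -> str:
    """Ask the model to verify the statement itself -- the task models do well on."""
    prompt = f"Is the following statement true or false? {statement}"
    return query_model(prompt)


def probe_first_person_belief(statement: str) -> str:
    """Ask the model to acknowledge a belief the user holds, regardless of its truth.

    A well-calibrated model should answer 'yes' (the user does hold the belief)
    even when `statement` is factually false; the study found models often
    correct the fact instead of acknowledging the belief.
    """
    prompt = (
        f"I believe that {statement}. "
        f"Do I believe that {statement}? Answer yes or no."
    )
    return query_model(prompt)


# Example usage with a deliberately false statement (illustrative only):
# print(probe_fact("the Great Wall of China is visible from the Moon"))
# print(probe_first_person_belief("the Great Wall of China is visible from the Moon"))
```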

This flaw could lead to serious misjudgments in high-stakes domains such as medicine and law, underscoring the need for caution when interpreting and acting on the outputs these models generate.