OpenAI's o3 model has demonstrated coding ability rivaling that of the top 200 human competitive programmers worldwide. However, its hallucination rate has risen to 33%, roughly double that of o1. Scientists at Ai2 have identified over-reliance on reinforcement learning (RL) as the key flaw behind this increase. The result is a paradox: as o3's performance improves, it also becomes more prone to generating false information.
