Claims Top Spot in Benchmarks, Then Reasoning Performance Crashes: Claude Opus 4.7's Reputation Tumbles Within 48 Hours of Release
Author: Editorial staff

In April 2026, Anthropic unveiled Claude Opus 4.7, and within 48 hours the release had drawn a sharply polarized response. On the official leaderboards, Opus 4.7 sat neck-and-neck with GPT-5.4 and Gemini 3.1 Pro at the top of the global rankings. In public logical reasoning tests, however, its accuracy plummeted from 94.7% to a mere 41.0%. User reports piled up describing a marked drop in coding proficiency, a noticeable regression in reasoning quality, and a 35% surge in token consumption, while glitches in the old interface drove usage costs even higher. Collectively, users summed up the model as 'pricier, less intelligent, and more contentious.' Anthropic responded that iterations of large-scale models often involve a redistribution of capabilities, in which gains in some areas can inadvertently come with setbacks in others. The explanation did little to placate users, who pointed to the subpar real-world experience as the real letdown.