Claims Top Spot in Benchmarks, Then Reasoning Performance Crashes: Claude Opus 4.7's Reputation Tumbles Within 48 Hours of Release
Author: Editorial staff

In April 2026, Anthropic unveiled Claude Opus 4.7, and within 48 hours the release had drawn a sharply polarized response. On the official leaderboards, Opus 4.7 sat neck-and-neck with GPT-5.4 and Gemini 3.1 Pro at the top of the global rankings. In public logical reasoning tests, however, its accuracy plummeted from 94.7% to a mere 41.0%. User reports piled up describing a marked drop in coding proficiency, a noticeable regression in reasoning quality, and a 35% surge in token consumption, while glitches in the old interface drove usage costs even higher. Collectively, users summed up the model as 'pricier, less intelligent, and more contentious.' Anthropic responded that iterations of large-scale models often involve a redistribution of capabilities, in which gains in some areas can inadvertently come with setbacks in others. The explanation did little to placate users, who pointed to the subpar real-world experience as the real letdown.