OpenAI Asserts GPT-5 Matches Human Performance in Multiple Professional Domains
1 week ago
Author: Editor

On Thursday (local time), OpenAI released a new benchmark called GDPval, which compares the performance of AI models with that of human professionals across a range of industries. The benchmark covers nine sectors, including healthcare and finance, and 44 occupations. It is OpenAI's first attempt to measure how close its systems are to human-level performance on "work of high economic value", and the company describes it as a key part of its mission to develop artificial general intelligence (AGI). According to the results, GPT-5 performed on par with or better than human experts on 40.6% of the tasks. Anthropic's Claude Opus 4.1 led the field with 49%. OpenAI said that Claude's higher score is partly due to the visual quality of the charts it produces. The company also stressed that the current benchmark covers only a limited set of tasks, and that it plans to extend it to more complete interactive workflows in the future.
