OpenAI Unveils HealthBench: A Benchmark for AI Health Systems
2025-05-14
Author: Editor

OpenAI has introduced HealthBench, an evaluation benchmark for AI health systems. It was developed together with 262 physicians from 60 countries. HealthBench contains 5,000 realistic health-related conversations and 48,562 scoring criteria, designed to assess the medical proficiency of large AI models. OpenAI used HealthBench to evaluate models including o3 and Gemini 2.5 Pro, with o3 emerging as the top performer.