ByteDance's VAPO Framework Breaks AIME24 Record, Significantly Boosting Large Language Models' Reasoning Capabilities
2025-04-12
Author: Editor

ByteDance has unveiled VAPO, a reinforcement learning training framework designed to strengthen the reasoning ability of large language models on complex, long-horizon tasks. Built on top of PPO, VAPO adds techniques such as value-model pretraining and length-adaptive generalized advantage estimation, and leans on the synergy among these components. After training with VAPO, the Qwen2.5-32B model's score on the AIME24 benchmark jumped from 5 points to 60.4 points, surpassing both the DeepSeek R1 and DAPO approaches. VAPO is particularly strong on mathematical reasoning and long-sequence tasks, and its training process is more stable and efficient. The coordinated integration of these techniques is what underpins VAPO's performance.
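To make the "length-adaptive generalized advantage estimation" ingredient more concrete, here is a minimal sketch of GAE in which the decay parameter λ grows with response length, so that credit from a sparse final reward propagates across long generations. The rule λ = 1 − 1/(α·L) and the parameter α are illustrative assumptions, not a confirmed detail of ByteDance's implementation.

```python
import numpy as np

def length_adaptive_gae(rewards, values, gamma=1.0, alpha=0.05):
    """Sketch of GAE with a length-adaptive lambda.

    Assumes lambda = 1 - 1/(alpha * L), where L is the response length,
    so longer responses use a lambda closer to 1. Illustrative only.
    """
    L = len(rewards)
    lam = max(0.0, 1.0 - 1.0 / (alpha * L))   # longer response -> lambda nearer 1
    values = np.append(values, 0.0)           # bootstrap value after the last token
    advantages = np.zeros(L)
    gae = 0.0
    for t in reversed(range(L)):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD error at token t
        gae = delta + gamma * lam * gae                         # running discounted sum
        advantages[t] = gae
    return advantages

# Toy usage: a sparse reward placed only on the final token of a 512-token response.
rewards = np.zeros(512); rewards[-1] = 1.0
values = np.zeros(512)
adv = length_adaptive_gae(rewards, values)
```

With a fixed small λ, the advantage signal from the final-answer reward would decay quickly and barely reach early tokens of a long chain of thought; letting λ scale with length keeps the effective credit-assignment horizon roughly proportional to the response, which is the intuition behind this component.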