Lin Junyang, the former technical lead of Alibaba's QianWen (Qwen) team, published an article after his resignation arguing that large AI models are shifting from "reasoning thinking" to "agentic thinking." The first wave of reasoning models opened a new phase for the field, one defined by scaling up reinforcement learning in post-training, with mathematics and coding becoming the key domains for improving model accuracy. He discussed the difficulties of merging "thinking" and "instruct" modes in a single model. The QianWen team tried to combine the two but ran into inherent conflicts between them, and later released separate versions. Meanwhile, companies such as Anthropic and DeepSeek continue to explore hybrid designs. Lin predicted that the era of simply lengthening a model's internal reasoning chain is coming to an end, and that agentic thinking will dominate going forward. Agentic reinforcement learning will change what is required of the technical stack, preventing reward hacking will become a major challenge, and the industry's competitive advantage will shift toward systems engineering capability.
