Kimi Unveils Trillion-Parameter Model K2, Topping Global Open-Source Benchmarks
4 days ago
Author: Staff Editor

The Kimi team at Moonshot AI ("Dark Side of the Moon") has announced the release and open-sourcing of K2, a new model built on the Mixture of Experts (MoE) architecture. With 1 trillion total parameters, of which 32 billion are active, K2 excels at autonomous programming, tool invocation, and mathematical reasoning, surpassing other leading open-source models worldwide. Its MuonClip optimizer enables efficient, stable training at the trillion-parameter scale. To counter the scarcity of high-quality data, the team improved token efficiency, opening new avenues for scaling pre-training. K2 also demonstrates strong coding capabilities and robust agent task handling, generalizing well across multiple real-world scenarios. The model is now available for open exploration and use.
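The gap between 1 trillion total parameters and 32 billion active parameters comes from MoE routing: each token is sent to only a few "expert" sub-networks, so most parameters sit idle on any given forward pass. The following is a minimal, hypothetical sketch of top-k expert routing (K2's actual gating design, expert count, and expert architecture are not described in this article; the linear "experts" here stand in for the real feed-forward blocks):

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Minimal top-k MoE forward pass: score all experts with a gate,
    run only the k best, and mix their outputs by softmax weight.
    Only those k experts' parameters are 'active' for this token."""
    logits = x @ gate_w                  # gate score per expert
    top = np.argsort(logits)[-k:]        # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()             # softmax over the selected experts
    # Weighted combination of the chosen experts' outputs
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
# Toy "experts": plain linear maps standing in for FFN blocks
expert_ws = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda x, W=W: x @ W for W in expert_ws]
gate_w = rng.normal(size=(d, n_experts))

x = rng.normal(size=d)
y = moe_forward(x, gate_w, experts, k=2)
print(y.shape)  # output has the same dimensionality as the input
```

With k=2 of 4 experts active, roughly half the expert parameters are touched per token; at K2's scale the same idea yields the reported 32B-active-of-1T ratio.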