Mingwalling Intelligence Teams Up with Tsinghua University to Unveil China's First 1.58-bit Large Model, BitCPM-CANN
Author: Editor

Mingwalling Intelligence, in partnership with Tsinghua University and the OpenBMB open-source community, has released and open-sourced BitCPM-CANN, China's first ternary (1.58-bit) large model. A ternary model restricts each weight to one of three values (-1, 0, or +1), which carries log2(3) ≈ 1.58 bits of information per weight. The model was trained on Huawei's Ascend platform, a notable step for low-bit large model training. It is available in four sizes: 0.5B, 1B, 3B, and 8B parameters.
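To make the "ternary (1.58-bit)" idea concrete, here is a minimal sketch of one common way to map floating-point weights onto the three values -1, 0, and +1. It assumes an absmean scaling rule of the kind described for other 1.58-bit models; the article does not say which quantization rule BitCPM-CANN uses, so the function names and the rule itself are illustrative assumptions, not the model's actual method.

```python
import math

# Information carried by one weight that can take three values.
BITS_PER_TERNARY_WEIGHT = math.log2(3)  # about 1.585 bits


def ternary_quantize(weights, eps=1e-8):
    """Map float weights to {-1, 0, +1} plus a single shared scale.

    Assumption: absmean scaling (divide by the mean absolute value,
    round, then clamp). This is an illustrative rule, not necessarily
    the one used by BitCPM-CANN.
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from ternary values."""
    return [q * scale for q in quantized]


# Hypothetical example weights, chosen only for illustration.
weights = [0.9, -0.05, -1.3, 0.4, 0.0, -0.6]
q, s = ternary_quantize(weights)
approx = dequantize(q, s)
```

Small weights collapse to 0 and larger ones to plus or minus one scale unit, so only the ternary values and one scale per group of weights need to be stored.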

At inference time, BitCPM-CANN reduces video memory usage roughly sixfold. As a result, the 8B-parameter variant can run on mainstream flagship smartphones.
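A rough back-of-the-envelope calculation shows why low-bit weights matter for on-device use. The sketch below assumes a 16-bit baseline, a nominal 8e9 parameters for the 8B variant, and counts weights only; none of these assumptions come from the article. Under them, the weights-only saving comes out larger than the roughly sixfold figure quoted above, which suggests the reported number accounts for other factors (for example, parts of the model kept at higher precision, or runtime buffers) that the article does not detail.

```python
import math


def weight_memory_gb(num_params, bits_per_weight):
    """Memory needed to store the weights alone, in gigabytes (1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: the "8B" variant has a nominal 8e9 parameters.
NUM_PARAMS = 8e9

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)             # 16-bit baseline (assumed)
ideal_gb = weight_memory_gb(NUM_PARAMS, math.log2(3))  # theoretical ternary limit
packed_gb = weight_memory_gb(NUM_PARAMS, 2)            # simple 2-bits-per-weight packing

print(f"16-bit weights:        {fp16_gb:.2f} GB")
print(f"ternary, ideal:        {ideal_gb:.2f} GB ({fp16_gb / ideal_gb:.1f}x smaller)")
print(f"ternary, 2-bit packed: {packed_gb:.2f} GB ({fp16_gb / packed_gb:.1f}x smaller)")
```

Even the conservative 2-bit packing brings the weights of an 8B model down to a size that plausibly fits in a flagship phone's memory, whereas a 16-bit copy generally would not.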

BitCPM-CANN was built with a complete low-bit training framework based on MindSpeed and Megatron-LM. All model weights are openly available and can be downloaded from HuggingFace and ModelScope.