Moore Threads' proprietary MUSA architecture has reached a notable milestone: integration with the open-source inference framework llama.cpp. The integration brings support for mainstream models such as LLaMA and Mistral, as well as multimodal applications, letting users run AI inference workloads efficiently on the MTT S80, S3000, and S4000 series GPUs. Before this, MUSA SDK 4.0.1 had already demonstrated compatibility with Intel processors and China's domestic Haiguang platforms. This latest adaptation further lowers the barrier to deploying large models and accelerates the growth of the domestic AI hardware ecosystem.
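As a rough sketch of what such a deployment typically looks like, the commands below follow llama.cpp's standard CMake build flow, with the `GGML_MUSA` option enabling the MUSA backend. This assumes the MUSA SDK is already installed; the repository URL reflects the current upstream, and the model path and prompt are illustrative placeholders, not values from the announcement.

```shell
# Clone llama.cpp and build it with the MUSA backend enabled
# (assumes the MUSA SDK is installed on the system)
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_MUSA=ON
cmake --build build --config Release -j

# Run inference with a GGUF model (model path is a placeholder);
# -ngl 99 offloads as many layers as possible to the GPU
./build/bin/llama-cli -m ./models/model.gguf -p "Hello" -ngl 99
```

The same `llama-cli` invocation works across llama.cpp's GPU backends; only the build-time flag selects MUSA rather than, say, CUDA or Vulkan.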