MAI-UI: Tongyi Open-Sources Its Full-Fledged GUI Agent Foundation Model with Native User-Interaction Capabilities
2025-12-29
Author: Editor

On December 29, 2025, Tongyi Lab's Multimodal Interaction Team announced the open-sourcing of MAI-UI, a foundation model for GUI (graphical user interface) agents. The model is released in four parameter sizes (2B, 8B, 32B, and 235B-A22B) and is built on Qwen3-VL. MAI-UI is presented as the first model to integrate three essential capabilities in a single architecture: user interaction, MCP tool invocation, and edge-cloud collaboration. It delivers top-tier results on benchmarks for GUI visual grounding and mobile task execution. The MAI-UI-2B and MAI-UI-8B variants are currently available on the Hugging Face platform.