MAI-UI: Tongyi Open-Sources Its Full-Fledged GUI Agent Foundation Model with Native User-Interaction Capabilities
2025-12-29
Author: Editor

On December 29, 2025, Tongyi Lab's Multimodal Interaction Team announced the open-sourcing of MAI-UI, a foundation model for GUI (graphical user interface) agents. The model is released in four parameter sizes (2B, 8B, 32B, and 235B-A22B) and is built on Qwen3-VL. MAI-UI is presented as the first model to integrate three essential capabilities in a single architecture: user interaction, MCP tool invocation, and edge-cloud collaboration. It delivers top-tier results on benchmarks for GUI visual grounding and mobile task execution. The MAI-UI-2B and MAI-UI-8B variants are currently available on the Hugging Face platform.