At the '2025 AI Container Application Implementation and Development Forum', Huawei, in partnership with Shanghai Jiao Tong University, Xi'an Jiaotong University, and Xiamen University, officially unveiled and open-sourced its AI container technology, Flex:ai. The solution is designed to provide fine-grained management and intelligent scheduling of computing power resources, thereby accelerating the broader adoption of AI technologies.
Zhou Yuefeng, Vice President of Huawei, emphasized that while AI technology has advanced rapidly, its 'democratization' (making it accessible and usable by a broader audience) still faces significant hurdles, and he cited the healthcare industry as a prime example of the prevailing difficulties in computing power scheduling. Liu Miao, also from Huawei, identified three main pain points in how AI computing power is used today: small tasks leave a single card underutilized, large tasks exceed the capacity of a single machine, and multiple tasks are difficult to schedule concurrently.
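To make the first pain point concrete, the toy calculation below compares allocating a whole card to one small task with packing several such tasks onto the same card. The card and task sizes are invented purely for illustration and are not Flex:ai figures.

```python
# Hypothetical illustration of single-card underutilization.
# All numbers are assumptions made for this example.

CARD_MEMORY_GB = 64          # assumed size of one accelerator card
TASK_MEMORY_GB = 12          # assumed footprint of one small inference task

# Whole-card allocation: each small task occupies an entire card.
whole_card_utilization = TASK_MEMORY_GB / CARD_MEMORY_GB
print(f"whole-card allocation: {whole_card_utilization:.0%} of the card used")

# Fractional allocation: several such tasks share the same card.
tasks_per_card = CARD_MEMORY_GB // TASK_MEMORY_GB
sliced_utilization = tasks_per_card * TASK_MEMORY_GB / CARD_MEMORY_GB
print(f"fractional allocation: {sliced_utilization:.0%} of the card used")
```

Under these assumed numbers, utilization rises from roughly 19% to roughly 94%, which is the gap that fine-grained sharing of a single card aims to close.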
To tackle these challenges, the three universities pursued technical research from different angles. Professor Qi Zhengwei from Shanghai Jiao Tong University introduced an XPU resource pooling framework that provides spatial sharing with resource isolation, raising the efficiency of computing power usage. Professor Zhang Yiming, affiliated with both Xiamen University and Shanghai Jiao Tong University, developed cross-node remote virtualization technology that aggregates idle XPU computing power across nodes. Professor Zhang Xingjun from Xi'an Jiaotong University created the Hi Scheduler, which enables fine-grained scheduling of computing power.
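As a rough sketch of what fine-grained scheduling over a pooled set of cards can look like, the snippet below packs fractional compute requests onto cards from several nodes using a simple first-fit rule. This is a hypothetical illustration of the general idea, not the Hi Scheduler or Flex:ai algorithm; every name and number in it is invented.

```python
"""First-fit placement of fractional XPU requests onto a pooled set of cards.

Illustrative sketch only: not the Flex:ai or Hi Scheduler implementation.
"""
from dataclasses import dataclass, field

@dataclass
class Card:
    node: str
    free: float = 1.0                       # remaining fraction of the card
    tasks: list = field(default_factory=list)

def schedule(cards, tasks):
    """Place each (name, fraction-of-a-card) task on the first card with room."""
    placements = {}
    for name, fraction in tasks:
        for card in cards:
            if card.free >= fraction:
                card.free -= fraction
                card.tasks.append(name)
                placements[name] = card.node
                break
        else:
            placements[name] = None         # no single card can host this task
    return placements

# A pool of cards drawn from two nodes, and a mix of small fractional tasks.
pool = [Card("node-a"), Card("node-a"), Card("node-b")]
jobs = [("ocr", 0.2), ("asr", 0.5), ("embed", 0.4), ("rerank", 0.6), ("llm", 0.9)]
print(schedule(pool, jobs))
```

A production scheduler would of course weigh far more than free capacity (topology, memory bandwidth, isolation, priorities), but the sketch shows why treating cards as a shared, fractionally allocatable pool lets many small tasks coexist where whole-card allocation would strand capacity.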
Huawei has open-sourced the full Flex:ai stack and plans to keep iterating on and optimizing it together with the three universities. Zhang Gong from Huawei also noted that enterprises still face numerous challenges when deploying AI inference, particularly around cross-node migration and large-scale cluster scheduling, and that resolving these remains essential to further advancing the adoption and efficiency of AI technologies.
