The Seed team at ByteDance has released and open-sourced UI-TARS-1.5, a multimodal agent built on a vision-language model and designed to carry out tasks efficiently in virtual environments. UI-TARS-1.5 achieves state-of-the-art (SOTA) performance on seven graphical user interface (GUI) evaluation benchmarks. It also demonstrates, for the first time, long-term reasoning in gaming scenarios and robust interaction in open spaces.
