As reported by The Robot Report, Microsoft Research introduced a new AI model called Rho-alpha on January 21, 2026. The model is designed to improve robots' ability to operate autonomously in complex, real-world environments. Built on the Phi vision-language framework, Rho-alpha combines vision, language, and tactile sensing, and for the first time can translate natural language instructions directly into robotic control signals and support tasks that require coordinating both hands.
Rho-alpha can also adjust its behavior dynamically and accept real-time corrections from humans. To address the scarcity of training data, the model was trained on a mix of real-world data and synthetic data generated through Azure simulations. Rho-alpha is currently being tested on dual-arm and humanoid robot platforms, and it will be made available to research institutions through an early access program.
