With new in-house models, Microsoft lays the groundwork for independence from OpenAI
3 days ago / About an 8-minute read
Source: Ars Technica
Microsoft is still deeply tied to OpenAI, but who knows what the future holds.


Credit: Microsoft

Microsoft has introduced AI models that it trained internally and says it will begin using them in some products. This announcement may represent an effort to move away from dependence on OpenAI, despite Microsoft's substantial investment in that company. It comes more than a year after insider reports revealed that Microsoft was beginning work on its own foundational models.

A post on the Microsoft AI blog describes two models. MAI-Voice-1 is a natural speech-generation model meant to deliver "high-fidelity, expressive audio across both single and multi-speaker scenarios." The idea is that voice will be one of the main ways users interact with AI tools in the future, though we haven't really seen that come to fruition so far.

The second model is called MAI-1-preview, and it's a foundational large language model specifically trained to drive Copilot, Microsoft's AI chatbot tool. It was trained on around 15,000 Nvidia H100 GPUs, and runs inference on a single GPU. As reported last year, this model is significantly larger than the models seen in Microsoft's earlier experiments, which focused on smaller models meant to run locally, like Phi-3.

To date, Copilot has primarily depended on OpenAI's models. Microsoft has invested enormous amounts of money in OpenAI, and it's unlikely the two companies will fully divorce any time soon. That said, there have been some tensions in recent months when the two companies' incentives or objectives have diverged.

Since it's hard to predict where this is all going, developing its own models is likely to Microsoft's long-term advantage.

It's also possible Microsoft has introduced these models to address use cases or queries that OpenAI isn't focused on. We're seeing a gradual shift in the AI landscape toward models that are more specialized for certain tasks, rather than general, all-purpose models that are meant to be all things to all people.

These new models follow that trend to some extent, as Microsoft AI lead Mustafa Suleyman said in a podcast with The Verge that the goal here is "to create something that works extremely well for the consumer... my focus is on building models that really work for the consumer companion."

As such, it makes sense that we're going to see these models rolling out in Copilot, which is Microsoft's consumer-oriented AI chatbot product. Of MAI-1-preview, the Microsoft AI blog post specifies, "this model is designed to provide powerful capabilities to consumers seeking to benefit from models that specialize in following instructions and providing helpful responses to everyday queries."

So yes, MAI-1-preview has a target audience in mind, but it's still a general-purpose model since Copilot is a general-purpose tool.

MAI-Voice-1 is already being used in Microsoft's Copilot Daily and Podcasts features. There's also a Copilot Labs interface that you can visit right now to play around with it, giving it prompts or scripts and customizing what kind of voice or delivery you want to hear.

MAI-1-preview is in public testing on LMArena and will be rolled out to "certain text use cases within Copilot over the coming weeks."