Microsoft has launched three new in‑house AI models: MAI‑Transcribe‑1, MAI‑Voice‑1, and MAI‑Image‑2. All three are available now through Microsoft Foundry and the newly introduced MAI Playground, a step toward competing with OpenAI and Google in AI model development and deployment, according to NewsBytes and VentureBeat.
What Was Announced
MAI‑Transcribe‑1 is a speech‑to‑text model that transcribes audio in 25 languages with low word error rates. NewsBytes reported that it performs competitively against OpenAI’s Whisper‑large‑v3 and Google’s Gemini 3.1 Flash Lite on benchmarks.
MAI‑Voice‑1 is a voice generation engine that produces audio quickly, preserves speaker identity, and can create a custom voice from just a few seconds of input audio, according to NewsBytes and other media coverage.
MAI‑Image‑2 is an image generation model. NewsBytes noted that it ranks among the top three on the Arena.ai leaderboard and generates images roughly twice as fast as its predecessor across Microsoft’s platforms.
Where and How Available
All three models are now accessible through Microsoft Foundry and the MAI Playground, enabling commercial use, evaluation, and integration into applications. Several reports confirm their availability via Azure AI infrastructure and deployment in Microsoft tools such as Copilot, Bing, and PowerPoint.
Why It Matters
This launch underscores Microsoft’s increasing focus on building its own AI stack rather than relying solely on external partners like OpenAI. According to VentureBeat, the move is a strategic shift intended to give Microsoft more control over model performance, cost, and integration.
Conclusion
With MAI‑Transcribe‑1, MAI‑Voice‑1, and MAI‑Image‑2, Microsoft now offers in‑house models for transcription, voice generation, and image generation. By pairing competitive reported performance and 25‑language transcription support with deployment through Foundry and its own products, the company positions itself as a contender to OpenAI and Google in enterprise AI.