
Microsoft Launches New MAI Models via Foundry as In‑House AI Push Intensifies

Microsoft has recently launched three in‑house AI models—MAI‑Transcribe‑1, MAI‑Voice‑1, and MAI‑Image‑2—through its Foundry platform, signaling a strategic step toward greater independence from OpenAI.

Microsoft has made three in‑house AI models (MAI‑Transcribe‑1, MAI‑Voice‑1, and MAI‑Image‑2) commercially available via its Foundry platform, a strategic move toward greater autonomy in AI development, according to Forbes. The models cover three key enterprise modalities, speech transcription, voice generation, and image creation, and are tailored to Microsoft's infrastructure.

What Was Announced

MAI‑Transcribe‑1 is a speech‑to‑text model that Microsoft describes as having the lowest word‑error rate across the top 25 languages by Microsoft product usage. Forbes notes that it performs competitively against industry‑leading models such as OpenAI's Whisper‑large‑v3 and Google's Gemini across multiple benchmarks. The model is now commercially available via Foundry.
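
For readers unfamiliar with the metric, word‑error rate is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. The function name and sample sentences are hypothetical, and it does not describe how Microsoft or Forbes measured MAI‑Transcribe‑1.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative only: real evaluations usually normalize case,
    punctuation, and numerals before comparing, which is skipped here.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical sample: one wrong word in a six-word sentence.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

In this sample, one substituted word out of six reference words gives a word‑error rate of about 0.167, or 16.7 percent. Lower is better, and a perfect transcript scores 0.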

MAI‑Voice‑1 is a text‑to‑speech model that generates audio while preserving speaker identity and emotional nuance. It can produce 60 seconds of audio in under a second and is also available through Foundry.

MAI‑Image‑2 is a text‑to‑image generation model ranked third on the Arena.ai leaderboard. It is similarly offered through Foundry.

Strategic Significance

This rollout suggests Microsoft is building a technical foundation to reduce reliance on OpenAI, while maintaining access to OpenAI’s models under its revised partnership, according to Forbes.

Why This Matters

For enterprise technology leaders, the availability of in‑house models across speech transcription, voice generation, and image generation offers greater flexibility in AI strategy and potentially more cost‑effective, integrated solutions within Azure.

Note: No verifiable sources confirm the specific model names “MAI‑1” or “Phi‑3” in the context of new Azure AI Foundry launches. Evidence supports only the MAI‑Transcribe‑1, MAI‑Voice‑1, and MAI‑Image‑2 models.

Conclusion

Microsoft’s release of three new MAI models via Foundry marks an important strategic evolution toward internal AI model development. References to “MAI‑1” and “Phi‑3” were not corroborated by available sources at this time.