Microsoft MAI: Voice, Transcription, and Image AI Models You Can Actually Use Today

Microsoft launches MAI-Voice-1, MAI-Transcribe-1, and MAI-Image-2, three foundational AI models for speech, transcription, and images via Azure.


Microsoft Builds Its Own AI Foundation


On April 2, 2026, Microsoft's MAI Superintelligence team, led by Microsoft AI CEO Mustafa Suleyman, announced three new foundational models: MAI-Voice-1 for text-to-speech, MAI-Transcribe-1 for audio transcription, and MAI-Image-2 for image generation. These models are available in public preview through Azure Speech, Microsoft Foundry, and the MAI Playground.

This announcement matters for a specific reason: Microsoft is building its own multimodal AI stack rather than relying entirely on its OpenAI partnership. After years of being primarily a distribution channel for OpenAI's models, Microsoft now has in-house foundational models for three of the most commercially important AI modalities. For developers and businesses already invested in the Azure ecosystem, these models offer native integration, competitive pricing, and the ability to build voice, transcription, and image features without third-party dependencies.

MAI-Voice-1: Text-to-Speech That Sounds Human

MAI-Voice-1 is a neural text-to-speech model that generates high-fidelity, natural, expressive speech with human-like intonation, rhythm, and emotion. The performance headline is impressive: it can generate 60 seconds of expressive audio in less than one second on a single GPU.


Voices and Customization

The model ships with six prebuilt English (US) voices, including Jasper and June, covering a range of speaking styles. These are not the robotic voices of earlier TTS systems. MAI-Voice-1 uses holistic text interpretation to automatically adjust tone, pace, and emphasis based on the content being spoken.

For more control, the model supports SSML (Speech Synthesis Markup Language) with emotion control. Developers can specify emotions like excitement, joy, sadness, or anger at the markup level, giving fine-grained control over how the generated speech sounds. This is particularly useful for applications like audiobook narration, customer service bots, and interactive entertainment where emotional range matters.
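As a sketch of what emotion markup looks like, the snippet below builds an SSML payload, assuming MAI-Voice-1 follows the existing Azure Neural TTS conventions (the `mstts:express-as` element); the voice name `en-US-Jasper` is illustrative, not a confirmed identifier.

```python
# Build an SSML document requesting an emotional speaking style.
# Assumes Azure's mstts:express-as convention applies to MAI-Voice-1;
# the voice name below is hypothetical.

def build_ssml(text: str, voice: str, emotion: str) -> str:
    """Wrap text in SSML that requests a specific emotional style."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">'
        f'<voice name="{voice}">'
        f'<mstts:express-as style="{emotion}">{text}</mstts:express-as>'
        '</voice></speak>'
    )

ssml = build_ssml("We did it!", "en-US-Jasper", "excited")
```

The resulting string can be passed to the synthesis endpoint in place of plain text, letting the same sentence be rendered with different emotional color per request.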

Voice Cloning

MAI-Voice-1 supports voice prompting and cloning from audio samples as short as 3 seconds and up to 120 seconds. This feature is under gated access, requiring approval, which reflects Microsoft's approach to managing the risks of voice cloning technology.

Voice cloning opens up applications like personalized virtual assistants, branded audio experiences, and accessibility tools that can read content in a familiar voice. The gated access approach balances capability with responsibility.

Technical Integration

The model takes text or SSML input and outputs audio in MP3, WAV, or Opus formats. It integrates with the Azure Speech SDK for real-time synthesis and supports long-form content with consistent voice quality across extended passages. The current release supports English only, with 10 or more additional languages planned.

Pricing

MAI-Voice-1 is priced at 22 dollars per million characters. For context, a typical novel contains roughly 500,000 characters, meaning full audiobook narration would cost approximately 11 dollars. This is competitive with existing TTS services and significantly cheaper than human narration for most use cases.
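The arithmetic behind that estimate is simple enough to check directly:

```python
# Back-of-envelope cost check for the pricing quoted above:
# $22 per million characters, ~500,000 characters in a typical novel.

PRICE_PER_MILLION_CHARS = 22.00  # USD, MAI-Voice-1 public-preview pricing

def synthesis_cost(num_chars: int) -> float:
    """Return the synthesis cost in USD for a given character count."""
    return num_chars / 1_000_000 * PRICE_PER_MILLION_CHARS

novel_cost = synthesis_cost(500_000)  # -> 11.0 USD
```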

MAI-Transcribe-1: Audio to Text at Scale

While Microsoft has not disclosed the same level of technical detail for MAI-Transcribe-1 as for the voice model, the transcription model is designed to work alongside MAI-Voice-1 as part of a complete audio pipeline. It converts audio to text, complementing the voice model's text-to-audio capabilities.

Together, these models enable round-trip audio workflows: transcribe a meeting, summarize the content, generate a polished audio summary, or translate spoken content from one language to another. For enterprises that process large volumes of calls, meetings, or media content, having both models available through the same Azure infrastructure simplifies architecture and reduces vendor complexity.
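A round-trip workflow of this kind can be sketched as follows. The three helper functions are placeholders standing in for the real Azure Speech / MAI API calls, which require credentials; here they are stubbed so only the control flow is shown.

```python
# Illustrative round-trip audio pipeline: transcribe -> summarize -> voice.
# Each helper is a stub for the real service call (MAI-Transcribe-1, an LLM,
# and MAI-Voice-1 respectively); none of these names are official APIs.

def transcribe(audio: bytes) -> str:
    """Stub for MAI-Transcribe-1: audio in, transcript out."""
    return "Q3 revenue is up 12%; hiring plan approved."

def summarize(transcript: str) -> str:
    """Stub for an LLM summarization step."""
    return "Summary: " + transcript.split(";")[0]

def synthesize(text: str) -> bytes:
    """Stub for MAI-Voice-1: the real call returns MP3/WAV/Opus audio."""
    return text.encode("utf-8")

def meeting_to_audio_summary(audio: bytes) -> bytes:
    """Transcribe a meeting, summarize it, and voice the summary."""
    return synthesize(summarize(transcribe(audio)))

summary_audio = meeting_to_audio_summary(b"\x00fake-meeting-recording")
```

The point of the sketch is the shape of the pipeline: because both models live behind the same Azure infrastructure, each arrow in transcribe → summarize → synthesize stays inside one vendor's authentication, billing, and compliance boundary.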

The transcription model integrates with existing Azure Speech services, which already support over 700 voices and multiple languages. This positions MAI-Transcribe-1 as an upgrade path for organizations already using Azure's speech capabilities.

MAI-Image-2: Visual Content Generation

MAI-Image-2 rounds out the trio as Microsoft's image generation model. Announced alongside the voice and transcription models, it is positioned as part of Microsoft's comprehensive multimodal AI offering through Azure and Foundry.

The image model represents Microsoft's continued investment in visual AI capabilities that power features across its product ecosystem, including Copilot and Bing. By developing in-house image generation, Microsoft reduces its dependence on external image generation APIs and can offer tighter integration with its productivity and developer tools.


Why Microsoft Is Building In-House

The strategic significance of the MAI models extends beyond their technical capabilities. Microsoft's relationship with OpenAI has been the defining AI partnership of the past three years, but the MAI team's development of independent foundational models signals a hedge against over-reliance on a single AI provider.

Mustafa Suleyman, who joined Microsoft after co-founding DeepMind and then Inflection AI, leads the MAI Superintelligence team. His presence at the helm of an independent model development effort within Microsoft indicates the company's seriousness about building proprietary AI capabilities alongside its OpenAI investment.

For developers and enterprises, this diversification is beneficial. It means more options within the Azure ecosystem, potential cost competition between Microsoft's own models and OpenAI's offerings, and reduced platform risk if the Microsoft-OpenAI relationship evolves.

Practical Applications for Businesses

Customer Communication

The combination of voice and transcription models opens up new possibilities for customer-facing applications. Businesses can build systems that transcribe customer calls in real time, analyze sentiment and intent, generate summaries, and even produce audio responses that sound natural and emotionally appropriate.

For contact centers, the economics are compelling. Automated voice interactions at 22 dollars per million characters are orders of magnitude cheaper than human agents for routine queries, while the quality gap continues to narrow with each model generation.
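To make "orders of magnitude" concrete, here is a rough per-call comparison under illustrative assumptions: a routine spoken response of about 1,000 characters, and a fully loaded human-agent cost of 5 dollars per handled call (both figures are assumptions for the sketch, not sourced numbers).

```python
# Rough per-call economics: synthesized response vs. a human agent.
# chars_per_response and HUMAN_COST_PER_CALL are illustrative assumptions.

PRICE_PER_MILLION_CHARS = 22.00   # USD, MAI-Voice-1 pricing
HUMAN_COST_PER_CALL = 5.00        # USD, assumed for illustration

def tts_cost_per_call(chars_per_response: int = 1_000) -> float:
    """Synthesis cost in USD for one spoken response."""
    return chars_per_response / 1_000_000 * PRICE_PER_MILLION_CHARS

ratio = HUMAN_COST_PER_CALL / tts_cost_per_call()  # roughly 200x cheaper
```

Under these assumptions the automated response costs about 2 cents versus 5 dollars, a gap that holds even if the real figures differ by a wide margin.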

Content Production

Media companies, educational platforms, and content creators can use MAI-Voice-1 to produce audio versions of written content at scale. The SSML emotion control means narrated content can carry appropriate emotional weight, not just flat reading. Combined with transcription for sourcing content from audio interviews or lectures, the models enable a complete content transformation pipeline.

Accessibility

Voice synthesis with emotional range and voice cloning from short samples can power accessibility tools that make digital content available to users with visual impairments or reading difficulties. The ability to clone a user's own voice (or a preferred voice) for reading content adds a personal dimension that generic TTS voices lack.

Productivity and Email

The voice and transcription capabilities have direct implications for how professionals manage communication. Imagine dictating email responses that are transcribed, polished by AI, and then optionally converted to audio messages with appropriate tone. The technology exists to build these workflows today using the MAI models.

This is the direction AI-powered communication tools are heading. Maylee, for instance, already applies AI to email productivity through features like Magic Reply, which learns your writing style from past emails and auto-drafts responses that sound like you. As voice AI models like MAI-Voice-1 mature, the natural extension is AI that does not just write like you but speaks like you, bridging the gap between text and voice communication.

Integration and Availability

All three MAI models are available through multiple channels. Azure Speech provides the most robust integration path for production applications, with SDK support for real-time synthesis. Microsoft Foundry offers a broader model catalog and deployment options. The MAI Playground provides a quick experimentation environment.

The models power features in existing Microsoft products including Copilot, Bing, and Teams. For organizations already using Microsoft's cloud and productivity stack, adopting the MAI models requires minimal additional infrastructure.

Current availability covers the East US Azure region, with broader regional support expected. English is the only language supported at launch, with multilingual support planned.


The Competitive Landscape

MAI-Voice-1 enters a competitive TTS market. ElevenLabs has been the dominant startup in high-quality voice synthesis, and early user comparisons suggest MAI-Voice-1 offers superior emotion rendering in some scenarios. However, it may occasionally rewrite scripts during generation, a quirk that developers need to test for in their specific use cases.

Google, Amazon, and various startups also offer TTS services, but Microsoft's advantage lies in ecosystem integration. For organizations running on Azure, using Microsoft 365, and building with Microsoft's developer tools, the MAI models slot in with minimal friction.

The transcription market is similarly competitive, with OpenAI's Whisper, Google's speech-to-text, and specialized providers like AssemblyAI all offering strong options. MAI-Transcribe-1's value proposition is less about being the best transcription engine and more about completing the Azure audio pipeline.

What Developers Should Know

If you are building on Azure, the MAI models are worth evaluating immediately. The pricing is competitive, the integration is native, and the quality, particularly for voice synthesis, appears to be at or near the state of the art.

Key limitations to be aware of: English-only at launch, limited regional availability in Azure, and the voice cloning feature requires gated access approval. The model's occasional tendency to adjust input text during voice generation may be a concern for applications requiring verbatim reproduction.
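One way to guard against that script-rewriting quirk is a round-trip check: synthesize the script, transcribe the result, and measure how closely the transcript matches the input. The transcript below is hard-coded for illustration; in practice it would come from a transcription call such as MAI-Transcribe-1.

```python
# Round-trip verbatim check: compare the intended script against a
# transcript of the generated audio. The transcript here is hard-coded;
# a real pipeline would obtain it from a transcription service.

from difflib import SequenceMatcher

def verbatim_score(script: str, transcript: str) -> float:
    """Return a 0..1 similarity ratio between intended and spoken text."""
    def normalize(s: str) -> str:
        return " ".join(s.lower().split())
    return SequenceMatcher(None, normalize(script), normalize(transcript)).ratio()

script = "Your order number 4417 ships on Monday."
transcript = "Your order number 4417 ships Monday."   # model dropped "on"
score = verbatim_score(script, transcript)
```

An application that needs verbatim reproduction (legal notices, medication instructions) could reject or regenerate any output whose score falls below a chosen threshold.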

For teams not on Azure, the MAI models are less compelling as standalone offerings. Their primary advantage is ecosystem integration, and that advantage only materializes if you are already invested in Microsoft's cloud platform.

The broader takeaway is that multimodal AI capabilities (voice, transcription, vision) are becoming commoditized. The differentiator is increasingly about integration, pricing, and how well these capabilities fit into existing workflows rather than raw model quality. Microsoft understands this, which is why the MAI models are positioned as infrastructure rather than standalone products.

Frequently Asked Questions

What are the three Microsoft MAI models announced in April 2026?

Microsoft announced MAI-Voice-1 (text-to-speech), MAI-Transcribe-1 (audio transcription), and MAI-Image-2 (image generation). All three are available in public preview through Azure Speech, Microsoft Foundry, and the MAI Playground.

How much does MAI-Voice-1 cost?

MAI-Voice-1 is priced at 22 dollars per million characters. For reference, narrating a typical novel of about 500,000 characters would cost approximately 11 dollars.

How fast is MAI-Voice-1 at generating speech?

MAI-Voice-1 can generate 60 seconds of expressive audio in less than one second on a single GPU, making it suitable for real-time applications.

Can MAI-Voice-1 clone voices?

Yes, MAI-Voice-1 supports voice cloning from audio samples as short as 3 seconds and up to 120 seconds. This feature is under gated access and requires Microsoft's approval to use.

What languages does MAI-Voice-1 support?

MAI-Voice-1 currently supports English (US) only, with 6 prebuilt voices. Microsoft has announced plans to add support for 10 or more additional languages.

Who leads the Microsoft MAI team?

Mustafa Suleyman, Microsoft's CEO for AI, leads the MAI Superintelligence team. He previously co-founded DeepMind and Inflection AI.

How do the MAI models integrate with existing Microsoft products?

The MAI models power features in Copilot, Bing, and Teams. They integrate with Azure Speech (which supports over 700 voices), Microsoft Foundry, and the Azure Speech SDK for real-time synthesis in custom applications.

Can I use MAI models outside of Azure?

The MAI models are primarily distributed through Azure Speech, Microsoft Foundry, and the MAI Playground. They are designed for the Azure ecosystem and do not offer standalone deployment outside Microsoft's cloud infrastructure.

© 2026 Maylee. All rights reserved.