A New Leap in AI Voice Technology
Microsoft has taken a big step forward in artificial intelligence with the introduction of native audio generation in Copilot. Unlike standard text-to-speech tools, this feature uses the company’s MAI-Voice-1 model to deliver voices that sound expressive, natural, and versatile. The update is currently part of the Copilot Labs experience and is available to anyone with a personal Microsoft account.
Three Distinct Modes for Different Needs
Copilot’s audio generation offers three modes designed for various scenarios. The Scripted mode provides a clear and direct reading of the text, making it suitable for formal announcements, document narration, or training content. The Emotive mode adds dramatic flair with varied pitch, tone, and intonation, perfect for marketing campaigns, social media content, and creative storytelling. The Story mode takes things further by introducing multiple voices and characters, making it ideal for podcasts, narrative content, and educational storytelling.
Powered by MAI-Voice-1
The backbone of this new feature is Microsoft’s MAI-Voice-1 model. Reportedly built and trained on around 15,000 Nvidia GPUs, it can generate a full minute of audio in less than a second on a single GPU. This means high-speed processing without compromising on voice quality. According to Microsoft AI CEO Mustafa Suleyman, the model has been fine-tuned so that generated voices feel less robotic and more human-like, helping Copilot stand out from other AI-powered assistants.
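To put the speed claim in perspective, here is a rough back-of-the-envelope sketch in Python. It takes the quoted figure (one minute of audio in under one second on a single GPU) as an upper bound on compute time and assumes, purely for illustration, that generation time scales linearly with audio length. The 30-minute episode is a hypothetical example, not a Microsoft benchmark.

```python
def real_time_factor(audio_seconds: float, compute_seconds: float) -> float:
    """Return how many seconds of audio are produced per second of compute."""
    if compute_seconds <= 0:
        raise ValueError("compute_seconds must be positive")
    return audio_seconds / compute_seconds


# Figures quoted in the article: 60 s of audio in under 1 s on one GPU.
AUDIO_SECONDS = 60.0
COMPUTE_SECONDS_UPPER_BOUND = 1.0

# Because 1 s is an upper bound on compute time, this is a lower bound on speed.
min_speedup = real_time_factor(AUDIO_SECONDS, COMPUTE_SECONDS_UPPER_BOUND)

# Hypothetical 30-minute podcast episode, assuming linear scaling (an assumption).
episode_seconds = 30 * 60
max_compute_seconds = episode_seconds / min_speedup

print(f"At least {min_speedup:.0f}x faster than real time")
print(f"A 30-minute episode would need under {max_compute_seconds:.0f} s of GPU time")
```

Under those assumptions, the model runs at least 60 times faster than real time, so a half-hour episode would take less than about 30 seconds of single-GPU compute. Real-world latency would also depend on queuing, network overhead, and any usage limits.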
Availability and Future Expansion
Right now, native audio generation is free to use through Copilot Labs, though Microsoft has not disclosed whether usage limits will be applied later. The company has also not confirmed when the feature will expand to the Copilot mobile and desktop apps. Given the growing demand for natural AI voices, a broader rollout seems likely, but no timeline has been announced.
Why This Matters for Users
For businesses, creators, and everyday users, Copilot’s audio generation opens new possibilities. Whether you want to turn a script into an ad-ready voiceover, create immersive podcast content, or simply make documents more engaging through narration, the tool provides flexibility and quality in just a few clicks.