What you need to know
- Microsoft recently released an AI tool called VALL-E that can create convincing replications of people’s voices.
- The tool uses just a 3-second recording as a prompt to generate content.
- VALL-E can replicate the emotions of a speaker, which sets it apart from many other AI models.
Microsoft recently released an artificial intelligence tool known as VALL-E that can replicate people’s voices (via AITopics). The tool was trained on 60,000 hours of English speech data and uses 3-second clips of specific voices to generate content. Unlike many AI tools, VALL-E can replicate the emotions and tone of a speaker, even when creating a recording of words that the original speaker never said.
The voice samples shared by Microsoft range in quality. While some of them sound natural, others are clearly machine-generated and sound robotic. Of course, AI tends to get better over time, so in the future generated recordings will likely be more convincing. Additionally, VALL-E uses only 3-second recordings as a prompt. If the technology were used with a larger sample set, it could likely create more realistic samples.
At the moment, VALL-E is not generally available, which may be a good thing as AI-generated replications of people’s voices could be used in dangerous ways by threat actors and others with malicious intent.
Windows Central take: Impressive but scary
While VALL-E is undoubtedly impressive, it raises several ethical concerns. As artificial intelligence becomes more powerful, the voices generated by VALL-E and similar technologies will become more convincing. That would open the door to realistic spam calls replicating the voices of real people that a potential victim knows.
Politicians and other public figures could also be impersonated. Given how quickly content spreads on social media and how polarized political discussions are, it’s unlikely that many would stop to ask if a scandalous recording were genuine, as long as it sounded at least somewhat authentic.
Security concerns also come to mind. My bank uses my voice as a password when I call. There are measures in place to detect voice recordings, and I’d assume the technology could sense if a VALL-E voice were used. That being said, it still makes me uneasy. There’s a good chance that the arms race will escalate between AI-generated content and AI-detecting software.
Although it is not a security concern, some have pointed out that voice actors may lose work to VALL-E and competing tech. It’s unfortunate to see people lose work, but I don’t see a way around this. If VALL-E reaches a point where it can replace voice actors for audiobooks or other content, companies are going to use it. That’s just the reality of technology advancing. In fact, Apple recently announced a feature that uses AI to read audiobooks.
Like any technology, VALL-E will be used for good, evil, and everything in between. Microsoft has an ethics statement on the use of VALL-E, but the future of its usage is still murky. Microsoft President Brad Smith has discussed regulating AI in the past (via GeekWire). We’ll have to see what measures Microsoft puts in place to regulate the use of VALL-E.
Original Article: Microsoft’s VALL-E can imitate any voice with just a three-second sample
More from: Microsoft Research