You might not think of it this way, but you probably hear AI voices all the time. When you’re talking to Alexa or Siri, that’s a model trained on human speech to be able to say almost anything. Did you ever have a celebrity give you directions on Waze? AI. And every time you watch TikTok and you hear that slightly too chipper voice speaking the captions aloud, that’s AI all the way down. Heck, Apple’s AI will even read you a romance novel before you go to bed.
AI systems are getting good at turning text into believable speech in almost any language and almost any voice. And on this episode of The Vergecast, the first in our three-part miniseries on AI, that voice is mine. We trained a bunch of different AI bots on the sound of my voice (sometimes by reading scripts full of nonsense sentences, sometimes by uploading hours of existing audio from past Vergecast episodes, sometimes a little of each) to see how well, and how quickly, we could make a passable AI copy of my voice.
It was… pretty wild. Here’s the episode:
And if you want a quick comparison of the different tools, first, here’s the reference speech we used from the great Dwight Schrute:
We transcribed that text and fed it into every AI generator we tested. Here’s how Podcastle interpreted it in the voice of AI David Pierce:
Here’s what Descript did with the same thing:
And the new Personal Voice feature in iOS 17:
And finally, ElevenLabs, easily the most realistic and impressive of the tools we tested:
Ultimately, I don’t think any of the AI voices are going to replace me. But they’re getting better really fast, and they raise both huge possibilities and huge questions. What does it mean that I can create a replica this good, and that these replicas are only going to get better and easier to make over time? What responsibilities do I have as the person who made it? What responsibilities do other people have?
We’re having a lot of debates over AI music right now, obviously, as artists’ voices are being used to train models that can make pretty convincing songs in just about anyone’s voice. That’s going to spawn a decade of interesting court cases and ethical debates, but those same things are coming for just you and me. How do we use these tools? How do we talk about them? Is it even possible to get the good, helpful, democratizing things from them without all the deepfakes and problems? We’ve got a lot to figure out and no time to lose. Because the tech is really good right now, and it’s getting better really fast.