heywood on 1/2/2023 at 16:22
In 2000, I played Deus Ex and enjoyed the in-game conversation with Morpheus. It was just a traditional dialogue tree, but it was deeper and more thoughtfully written than any NPC interaction I'd had in a game up to that point. Ever since then, I've been dreaming about getting rid of canned NPC barks and dialogue trees in RPGs and imm sims, and replacing them with a conversational AI back end.
Giving each character a unique and original voice is one of the challenges. Giving each character a unique and original personality is something entirely different.
Qooper on 1/2/2023 at 17:47
ChatGPT can almost write a decent Monty Python sketch:
Quote:
Title: "The Swift Watchmaker"
Scene: A watchmaker (Mr. Thompson) is working in his shop when the king (His Majesty) enters, accompanied by his court.
His Majesty: (excited) Mr. Thompson, I've heard that you are the most skilled watch maker in all the land. I need your assistance!
Mr. Thompson: (bowing) Of course, Your Majesty. How may I be of service?
His Majesty: (pulls out a pocket watch) I have a special request. I need you to make a watch that is so accurate, it will never lose a second.
Mr. Thompson: (impressed) That is quite a challenge, Your Majesty, but I am confident that I can create such a watch for you.
(Mr. Thompson sets to work, and several days later, presents the finished watch to His Majesty)
Mr. Thompson: (proudly) Here it is, Your Majesty. The most accurate watch in the world. It will never lose a second.
His Majesty: (excitedly) Let's see it in action!
(Mr. Thompson winds the watch and sets it on the table. It starts ticking at a frantic pace, so fast that it sounds like a buzz)
His Majesty: (stunned) What is this? This is not a watch, it's a blur!
Mr. Thompson: (realizing his mistake) Oh, my. I seem to have made it a bit too quick.
His Majesty: (disappointed) This is unacceptable, Mr. Thompson. I needed a watch that was accurate, not one that runs faster than a horse.
Mr. Thompson: (embarrassed) I am so sorry, Your Majesty. I will fix it, I promise.
His Majesty: (sighing) I hope so, Mr. Thompson. I have important events that require punctuality, and this watch is worthless to me.
(Mr. Thompson takes the watch back to his shop and sets to work. Several days later, he presents a corrected watch to His Majesty)
Mr. Thompson: (nervously) Here it is, Your Majesty. The corrected watch.
His Majesty: (cautiously) Let's see it in action.
(Mr. Thompson winds the watch and sets it on the table. It ticks at a slow, steady pace)
His Majesty: (relieved) Ah, much better. This is what I was looking for.
Mr. Thompson: (bowing) I am so glad I could meet your expectations, Your Majesty.
His Majesty: (smiling) Yes, you have done well, Mr. Thompson. And as a reward, I will give you a gold coin for every second this watch runs without fail.
Mr. Thompson: (delighted) Thank you, Your Majesty!
(The king pockets the watch and starts to walk out of the shop, but his steps become faster and faster with each tick of the watch. He tries to slow down, but his feet are moving out of control)
His Majesty: (panicking) Help! What have you done to me, Mr. Thompson?
End scene.
Twist on 1/2/2023 at 20:30
If I were young and entering the job market for the first time in the next 3 to 5 years or later, I would start putting significant time & effort into studying & practicing how to work with these different AIs to curate & optimize their output or response.
That skill alone may become more important than any individual skill being executed by an AI.
And this appears to be headed for every field. I anticipate Cipheron being spot on about a new kind of digital divide developing.
Azaran on 1/2/2023 at 20:53
(https://www.theverge.com/2023/1/31/23579289/ai-voice-clone-deepfake-abuse-4chan-elevenlabs) You knew this would happen
In a related post I heard an AI recording (made with ElevenLabs) of David Attenborough spouting the most vile racist vitriol known to man. It's frighteningly convincing. I imagine this will have a huge impact on scandals, the justice system, audio evidence, and so on. People actually caught on recordings saying things they shouldn't will claim it's AI-generated.
heywood on 1/2/2023 at 21:28
Oh yeah, we could see that coming.
I think we'll adjust to deepfakes rather quickly. Existing cryptographic technology like certificates and blockchains already provides what's necessary to answer questions of authenticity and provenance when it's important. People will choose to believe what they want despite it.
And Twist, my head is nodding. Machine learning and gene editing are the most transformational technologies I've seen since the WWW. And you can play with ML at home.
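On the provenance point above, here is a minimal sketch of the hash-chain idea that blockchains build on, using only Python's standard `hashlib` and `json`. The names (`append_entry`, `verify_chain`, `contains_payload`) are made up for illustration. It only shows tamper evidence: each entry commits to the one before it, so editing an old entry breaks every later hash. It does not show who published the recording; for that you would add public-key signatures or certificates on top, which this sketch leaves out.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, digest: str, note: str) -> str:
    """Hash one log entry together with the hash of the entry before it."""
    record = json.dumps(
        {"prev": prev_hash, "digest": digest, "note": note}, sort_keys=True
    )
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: bytes, note: str) -> list:
    """Record the SHA-256 digest of a file (e.g. an audio clip) in the log."""
    prev = chain[-1]["hash"] if chain else GENESIS
    digest = hashlib.sha256(payload).hexdigest()
    chain.append(
        {
            "prev": prev,
            "digest": digest,
            "note": note,
            "hash": entry_hash(prev, digest, note),
        }
    )
    return chain


def verify_chain(chain: list) -> bool:
    """Return True only if no entry has been altered, removed, or reordered."""
    prev = GENESIS
    for entry in chain:
        expected = entry_hash(entry["prev"], entry["digest"], entry["note"])
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True


def contains_payload(chain: list, payload: bytes) -> bool:
    """Check whether this exact file was ever recorded in the log."""
    digest = hashlib.sha256(payload).hexdigest()
    return any(entry["digest"] == digest for entry in chain)
```

If a clip's bytes are not in a log you trust, or the log fails `verify_chain`, you have a reason to doubt it. Whether people care to check is the separate problem.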
demagogue on 1/2/2023 at 22:15
I guess this is flirting with further abuse, but I hope one of the features in the works is to take an existing vocal track and have it replaced with a custom voice. I'm thinking about the pipeline for making new characters in games, like for Dark Mod, but it'd apply to any game a person is making, or even if they want to mod new characters into other people's games.
If you've ever dug through the audio files of a game, there will typically be hundreds or thousands of audio snippets for each character that can easily take tens of hours of recording. Usually when you want to make a new character, the really hard part isn't the modeling and animation so much as making a complete new audio set: it takes hours of recording, it has to be done on the right equipment, if you mess up the person has to come back in or redo it themselves another day, and you probably can't add to the set after the initial recordings.
But if you could just batch process an existing set with a new character's voice, that would make that whole pipeline a whole lot easier. I guess you'd want a lot of new dialog, and you can still do that too. But as a quick job, it's probably easy to use an existing set, plus you can't really get the proper prosody (emotion and emphasis in voice) in with the text prompter, and there are a lot of things like grunts, breaths, and death screams where the change in voice is enough, and for a lot of things you can probably just shuffle phrases around and it'd be novel enough.
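The batch step described above could look roughly like this. It's a sketch, not a real tool: `batch_convert` is a made-up name, and `convert` stands in for whatever voice-replacement function you would actually plug in (no such function from any real service is assumed here). It walks an existing character's audio folder and writes converted files into a mirrored folder for the new character, keeping the same file names so the game's sound references still line up.

```python
from pathlib import Path
from typing import Callable, List


def batch_convert(
    src_dir: Path,
    dst_dir: Path,
    convert: Callable[[bytes], bytes],
    pattern: str = "*.wav",
) -> List[Path]:
    """Run `convert` on every matching file under src_dir.

    `convert` is a hypothetical placeholder: it takes the raw bytes of one
    audio snippet and returns the bytes of the re-voiced snippet.
    Output files keep the same relative paths under dst_dir.
    """
    written = []
    for src in sorted(src_dir.rglob(pattern)):
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        dst.write_bytes(convert(src.read_bytes()))
        written.append(dst)
    return written
```

Passing the converter in as a parameter keeps the folder handling separate from the voice model, so grunts, breaths, and spoken lines can all go through the same loop.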
-----
Edit: And once you have that much, it's easy to see how the pipeline could also replace people in TV shows and games using a few photos of them and a 2-minute clip of them talking. I somehow feel like this is going to become really common really soon.
Azaran on 2/2/2023 at 05:56
(https://boards.4channel.org/v/thread/626049265/) This comment from the 4chan thread on the audio replicator says it all
Quote:
The real answer is they knew exactly what was going to happen, and counted on it. Unfiltered launch would drive the numbers through the roof, which they can then show their investors to gouge more money out of them, while reaping the bonus brownie points for tightening the screws.
heywood on 2/2/2023 at 12:36
I was thinking along similar lines, demagogue. Text to speech has come a long way, but can you give it acting prompts? Can you ask it to say something in a way that expresses emotion?
demagogue on 2/2/2023 at 13:38
I uploaded Krusty the Clown's voice, and the voice set that came back completely cut the affectation out of his voice, so it was just an emotionless, flat voice, if you can believe it. It's like you're just listening to the voice actor's normal voice.
That said, if you drop the stability and similarity sliders, use expressions that typically have emotional valence, and repeat lines enough, you'll get some tracks with some emotion. And I think the way it works is that once it's got a valence, it sticks with that for the rest of the track. So if you can get it really worked up for the first line, you can get the emotion into later lines. (It also works the other way around where, if the first line is low energy, the whole track will be.)
It'd be better if there were prosody sliders; but as they charge by the word, it's kind of easy to see why they went with a system where you have to make a track 8 times until you get the voice as close as you can get to what you want.