demagogue on 31/1/2023 at 23:19
Welp, here's the rendition of Principal Skinner reading a paragraph on action sentences from the book I'm reading that you didn't know you wanted to hear: (https://voca.ro/15AYghHnUfvl)
We're really in uncharted waters now, folks.
Edit: By the way, here's the link to the site running the tech so you don't have to dig for it: (https://beta.elevenlabs.io/)
Azaran on 31/1/2023 at 23:29
I knew of other programs years back, but they were all crappy. This is eons ahead of the ones I'd seen.
Cipheron on 1/2/2023 at 01:46
All the stuff for automated NPC conversations in RPGs is really coming together. People are already linking up GPT, voice to text and text to voice in Unreal Engine mods etc, so that you can actually converse with NPCs like real people.
How that could work with the game's lore, as far as I understand it, comes down to knowing which context-relevant data to send to the model. Basically, you break the lore into chunks (which could be automated), then run each chunk through the language model to generate an embedding (a fixed-length vector that captures the meaning of that chunk). You store the chunks of lore along with their embeddings, and when a request comes in, that request gets embedded too; then you just search the database for lore chunks whose embeddings are close to the request's embedding, and those get sent to the language model as additional context.
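Just to sketch the idea (this is a toy illustration, not how any real game does it): the `embed` function below is a hypothetical stand-in for a real embedding model, the lore chunks are made-up examples, and retrieval is a plain cosine-similarity search over the stored vectors.

```python
import hashlib
import numpy as np

# Toy stand-in for a real embedding model; a production pipeline would
# call an actual LLM embedding endpoint. Here we just hash words into
# a fixed-size bag-of-words vector, normalized to unit length.
def embed(text, dim=512):
    vec = np.zeros(dim)
    for word in text.lower().split():
        word = word.strip(".,?!")
        vec[int(hashlib.md5(word.encode()).hexdigest(), 16) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

# Break the (made-up) lore into chunks and store each with its embedding.
lore_chunks = [
    "The dwarven fortress of Kal Dorim fell to a goblin siege.",
    "The river Azlan floods every spring, drowning the lowlands.",
    "Queen Ilsa banned magic after the great tower fire.",
]
index = [(chunk, embed(chunk)) for chunk in lore_chunks]

# Embed the player's question, then return the chunks closest to it
# by cosine similarity; those would be prepended to the LLM prompt.
def retrieve(query, k=2):
    q = embed(query)
    ranked = sorted(index, key=lambda item: -float(q @ item[1]))
    return [chunk for chunk, _ in ranked[:k]]

context = retrieve("What happened to the dwarven fortress?")
```

In a real pipeline you'd swap the hash trick for actual model embeddings and the list scan for a vector database, but the shape of the search is the same.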
I'm pretty sure this is how they're doing it with ChatGPT any time it talks as if it has encyclopedia-type knowledge. The language model itself hasn't internalized all of that; they're kind of "cheating" by finding the appropriate "cheat notes" and uploading them along with your prompt.
So yeah, that would be great to use for NPCs to give them access to the background lore for your world in the game. Imagine some future version of Dwarf Fortress or something, with procedurally generated world history, but on top of that it has AI-driven chatbots for the NPCs which get fed the appropriate snippets of world lore automatically.
demagogue on 1/2/2023 at 01:59
Yeah I remember dreaming of that back in the day, and now it's pretty straightforward to see the pipeline. What's significant is how high quality it can be, not to mention how quickly it's generated.
The only thing is that the models for ChatGPT and Prime Voice are probably in the teens to tens of GB, so if you didn't want to store that locally, the game would have to call it remotely. But that's pretty standard these days. Then again, so is installing games with large file sizes.
But even with all that said, if it's for a focused game, you might be able to get away with pretty small models (just a few GB) just for what's relevant to the game, and it'll probably still work great for the purposes of the game.
I guess we can't forget that procedural art and gameplay can also be a part of it. I think soon enough this is gonna be a whole genre unto itself, where we have to learn new rules. While I can see a lot of it being sludge, I can also see a lot of creative people figuring out how to wrestle really good writing and content with it as well.
Cipheron on 1/2/2023 at 04:06
Quote Posted by demagogue
Yeah I remember dreaming of that back in the day, and now it's pretty straightforward to see the pipeline. What's significant is how high quality it can be, not to mention how quickly it's generated.
The only thing is that the models for ChatGPT and Prime Voice are probably in the teens to tens of GB, so if you didn't want to store that locally, the game would have to call it remotely. But that's pretty standard these days. Then again, so is installing games with large file sizes.
No, the scale is more than that. GPT-3 had 175 billion parameters. At 32 bits per parameter, that's 700 GB of RAM just to hold the model; and they probably want to use 64-bit values, which would make it 1.4 TB.
ChatGPT is probably bigger than that, but they've said GPT-4 will only have about 50% more parameters than GPT-3. That makes sense, since 2 TB of RAM is basically the limit for high-end server motherboards. So they're effectively pushing up against the limits of regular server-grade hardware, and that's what's constraining the models from getting much bigger right now.
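The arithmetic behind those figures is just parameter count times bytes per parameter (ignoring activations and other runtime overhead):

```python
# Back-of-envelope RAM needed just to hold a model's parameters.
def param_memory_gb(n_params, bytes_per_param):
    return n_params * bytes_per_param / 1e9

GPT3_PARAMS = 175e9  # 175 billion parameters

print(param_memory_gb(GPT3_PARAMS, 4))  # 32-bit floats: 700 GB
print(param_memory_gb(GPT3_PARAMS, 8))  # 64-bit floats: 1400 GB = 1.4 TB
```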
EDIT: You can find some articles (bullshit ones) floating around claiming that GPT-4 will be 170 trillion parameters, for example this one: (https://medium.com/geekculture/gpt-4-100x-more-powerful-than-gpt-3-38c57f51e4e3)
Quote:
GPT-4–100X More Powerful than GPT-3
...
GPT-4 is significantly larger and more powerful than GPT-3, with 170 trillion parameters compared to GPT-3's 175 billion parameters.
... Except that, firstly, that's nonsense. And secondly, 170 trillion is about 1000x the size of 175 billion, not 100x, so it's not even mathematically accurate nonsense.
Such a piece of software would require 1.36 petabytes of RAM to store the parameters at 64-bit. This would mean the software could only be run on a handful of the most powerful supercomputers in the world, instead of being rolled out on commercial server hardware.
Here's the Wikipedia article on the top supercomputers in the world: (https://en.wikipedia.org/wiki/TOP500)
It contains a top 10 list, and the 10th most powerful machine, Tianhe-2 in China, only has 1,375 TiB of total memory, so even it could barely fit a 64-bit, 170-trillion-parameter model in RAM, and then only if you crammed both the GPU memory and the CPU memory full of parameters. So this is literally stuff that could only run on a number of billion-dollar machines worldwide that you could count on your fingers.
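Checking those debunking numbers (the 170-trillion figure is the article's claim, and 1,375 TiB is Tianhe-2's reported total memory):

```python
# A hypothetical 170-trillion-parameter model at 8 bytes (64 bits)
# per parameter, versus Tianhe-2's reported 1,375 TiB of memory.
model_bytes = 170e12 * 8          # 1.36e15 bytes
model_pb = model_bytes / 1e15     # in petabytes
tianhe2_bytes = 1375 * 2**40      # TiB -> bytes, ~1.51e15

print(model_pb)                       # 1.36
print(model_bytes <= tianhe2_bytes)   # True: it only just fits
```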
demagogue on 1/2/2023 at 04:17
Ah, I was thinking about the Stable Diffusion (AI art) model as a reasonable comparator, but I guess it makes sense it'd be a lot bigger for a language model. They call them large language models (LLMs) after all.
Okay, so short of some ridiculous memory storage tech in the near future (not ruling anything out anymore, but I'm not betting on it too soon), it'd need to be on the studio's servers and able to scale to a lot of users.
But I was already thinking that was going to become more standard in the near future anyway, cf. MSFS and some other big data games. Something like this would just accelerate that trend.
heywood on 1/2/2023 at 14:37
Forget about training an AI to mimic a person's voice, that path leads into an ethical minefield. The missing piece is a parametric voice generator, where you can specify a gender, range, style, accent, and other parameters necessary to make a convincing but unique-sounding voice. At that point, voice acting as a profession is effectively dead.
Azaran on 1/2/2023 at 15:30
Quote Posted by heywood
Forget about training an AI to mimic a person's voice, that path leads into an ethical minefield. The missing piece is a parametric voice generator, where you can specify a gender, range, style, accent, and other parameters necessary to make a convincing but unique-sounding voice. At that point, voice acting as a profession is effectively dead.
Well guess what
Inline Image: (https://i.postimg.cc/BndQXKCb/Capture.png)