Companies are going all-in on artificial intelligence right now, investing millions or even billions into the area while slapping the AI initialism on their products, even when doing so seems strange and pointless.
Heavy investment and increasingly powerful hardware tend to mean more expensive products. To discover if people would be willing to pay extra for hardware with AI capabilities, the question was asked on the TechPowerUp forums.
The results show that over 22,000 people, a massive 84% of the overall vote, said no, they would not pay more. More than 2,200 participants said they didn’t know, while just under 2,000 voters said yes.
I’m still not sold that dynamic text generation is going to be the major near-term application for LLMs, much less in games. Don’t get me wrong, what they’ve done is impressive. But I’ve also found it to be the least practically useful of the LLM model categories. You can make real, honest-to-God solid, usable graphics with Stable Diffusion. You can do pretty impressive speech generation in TortoiseTTS. I imagine that someone will make a locally-runnable music LLM model and software at some point if they haven’t yet; I’m pretty impressed with what the online services do there. I think that there are a lot of neat applications for image recognition; the other day I wanted to identify a tree and seedpod. No one has built software to do that yet (that I’m aware of), but I’m sure that someone will; the ability to map images back to text is pretty impressive. I’m also amazed by the AI image upscaling that Stable Diffusion can do, and I suspect that there’s still room for a lot of improvement there, as that’s not the main goal of Stable Diffusion. And once someone has done a good job of building a bunch of annotated 3D models, I think that there’s a whole new world of 3D.
I will bet that before we see that becoming the norm in games, we’ll see LLMs regularly used for either pre-generated or in-game speech synthesis, so that characters can speak text that isn’t just a set of static pre-recorded samples (the text might be procedurally generated, but not necessarily by an LLM). It’s not practical to have a human voice actor record every possible phrase that one might want an in-game character to speak.
I think it’s coming pretty fast. There’s already a mod for Skyrim that lets you talk to your companion. People are spending hours talking to LLMs and roleplaying; the first triple-A game to incorporate it is going to be a massive hit imo. I’m actually surprised no one’s been coming out with visual novels using them; it seems like a perfect use case.
It’s definitely going to be used first for making the content of the game, like you said, though.
there are some local genAI music models, although I don’t know how good they are yet as I haven’t tried any myself (Stable Audio is one, but I’m sure there are others)
also minor linguistic nitpick, but LLM stands for ‘large language model’ (you could maybe get away with it for PixArt and SD3 as they use T5 for prompt encoding, which is an LLM; I’m sure some audio models with lyrics use them too). the term you’re looking for is probably ‘generative’