Different AI labs have different priorities. OpenAI, for example, has traditionally focused on consumer users, while rival Anthropic tends to target enterprises. And Elon Musk’s xAI, we recently discovered, has a particular focus on video game walkthroughs.
On Friday, Business Insider’s Grace Kay published an extensive report on xAI, the AI startup recently acquired by SpaceX, focusing in particular on how Musk is making life difficult for his employees. One anecdote, however, stood out.
In one instance last year, a model’s release was delayed by several days after Musk became dissatisfied with the chatbot’s answers to detailed questions about the video game “Baldur’s Gate,” according to people familiar with the matter. High-level engineers were pulled from other projects to improve the responses before launch.
Of course, you can imagine the frustration of a respected and experienced engineer who comes to work intending to tackle fundamental problems in knowledge and machine intelligence, only to end up getting sidetracked and helping a 54-year-old man complete a video game. But this anecdote raises a more pressing question: Did Musk finally get the gaming skills he wanted?
To answer that question, our resident RPG enthusiast Ram Iyer compiled five common questions about Baldur’s Gate. I then ran those questions through xAI’s Grok and three other major models, as a kind of quasi-benchmark I’ve decided to call “BaldurBench.”
In the interest of journalistic transparency, we have made all of the chat transcripts publicly available for Grok, ChatGPT, Claude, and Gemini.
First, the good news: Grok actually provides pretty good information. Its answers were a bit heavy on gamer jargon (“save-scumming” instead of saving, “DPS” instead of damage), but if you knew what it was talking about, the answers were useful and well-informed. Grok also loves tables and theory-crafting. Go figure.
There are plenty of guides for Baldur’s Gate, so the models were usually drawing on the same material, and the biggest difference between them was style. ChatGPT likes bulleted lists and sentence fragments, while Gemini likes to bold important words.
The biggest surprise was Claude, which was especially concerned about giving me information that would spoil my gaming experience. When I asked it about a good party composition, it concluded, “Don’t stress too much and just play what you think will be fun.” Thank you, Claude!
It’s important to keep in mind that this is a subject area where we know (thanks to Business Insider’s report) that xAI was particularly focused on achieving parity. One should therefore not read too much into the fact that, after the reported sprint, Grok’s advice turned out to be almost the same as the other models’. Still, it’s nice to know that xAI can make it work when it tries.
