Elon Musk just launched Grok 3, the latest version of xAI’s LLM, which was trained at the Colossus Supercluster in Memphis, Tennessee using 100,000 Nvidia H100 GPUs. About a week ago, he said that its full release was imminent and claimed that it would outperform its rivals. Today he launched the AI model via a live stream on X (formerly Twitter), showcasing impressive performance benchmark results.
Early Grok-3 benchmarks show it dominating the field. pic.twitter.com/KXubPhaA5x (February 18, 2025)
Musk began the presentation by saying “The mission of xAI and Grok is to understand the universe,” and explaining that he wants to answer questions like, “What’s going on? Where are the aliens? What is the meaning of life? How does the universe end? How did it start?” He added, “Of course, that’s to be a maximally truth-seeking AI even if that truth is sometimes at odds with what is politically correct.”
After speaking about his goals with AI, Musk proclaimed that Grok 3 is an order of magnitude more capable than Grok 2, and that it was trained in a very short period. This was likely possible because of the massive number of GPUs xAI used for parallelized training. The cluster itself took just 19 days to set up, a record time given that Nvidia CEO Jensen Huang said such a setup usually takes four years.
Grok 3 isn’t just a single LLM, though. Instead, it’s a family of several models, with the first ones launched being Grok 3 and Grok 3 mini. xAI also showed off Grok 3 Reasoning and Grok 3 mini Reasoning, which are similar to OpenAI’s o3-mini and DeepSeek’s R1 models and solve problems through a step-by-step logical process.
Benchmarks shown by the xAI team have the Grok-3 and Grok-3 mini models outperforming their competition, including Gemini-2 Pro, DeepSeek-V3, Claude 3.5 Sonnet, and GPT-4o, in several tests, including Math (AIME), Science (GPQA), and Coding (LCB). The reasoning models, which are accessible via the Grok app, also outperform the competition on the same benchmarks. Aside from this, the Grok app will have a new feature called DeepSearch, which scours the internet when asked a question and distills the information it finds into a single answer.
Some outside experts were given access to Grok 3 in advance and were able to test these claims. For example, former Tesla Director of AI and OpenAI co-founder Andrej Karpathy shared his test results on X, saying that Grok 3 + Thinking feels similar to OpenAI’s o1-pro model while being a bit better than DeepSeek-R1 and Gemini 2.0 Flash Thinking. This is actually quite a feat, especially since OpenAI and Google have had a massive head start over xAI.
I was given early access to Grok 3 earlier today, making me I think one of the first few who could run a quick vibe check. Thinking ✅ First, Grok 3 clearly has an around state of the art thinking model (“Think” button) and did great out of the box on my Settler’s of Catan… pic.twitter.com/qIrUAN1IfD (February 18, 2025)
Grok 3 will be available to X Premium+ subscribers first. However, those who want to access more advanced features will need to sign up for SuperGrok, which is rumored to cost around $30 a month or $300 annually.