Elon Musk reveals photos of Dojo D1 Supercomputer cluster — roughly equivalent to 8,000 Nvidia H100 GPUs for AI training

Source: Tom's Hardware added 24th Jul 2024


Fresh off firing up the Memphis Supercluster, claimed to be “the most powerful AI cluster in the world,” Elon Musk has now shared pictures of a supercomputer cluster that uses Tesla’s homegrown Dojo AI accelerators. He also announced on the Tesla earnings call that he would double down on Dojo development and deployment due to the high pricing of Nvidia’s GPUs.

Aside from the opening of the xAI facility in Tennessee, which aims to have 100,000 Nvidia H100 GPUs on a single fabric, Musk said he will have Dojo D1 up and running by the end of the year. According to Musk, it will offer the processing power of roughly 8,000 of Nvidia’s H100 chips, which he called “not massive, but not trivial either.”

Musk first unveiled the Dojo D1 chip in 2021 with a performance target of 322 teraflops. Then, in August last year, Tesla was spotted hiring a Senior Engineering Program Manager for Data Centers, usually one of the first steps an organization takes when planning its own data centers. Tesla also doubled its Dojo D1 orders the following month, signaling its confidence in the chip’s performance.

By May 2024, the Dojo processor was reported to be in mass production. Now the chips appear to have made their way to the States and into Musk’s hands: yesterday he shared pictures of the Dojo supercomputer in its new home in the data center.

“And Dojo 1 will have roughly 8k H100-equivalent of training online by end of year. Not massive, but not trivial either.” (Elon Musk, July 23, 2024)

The Dojo chips are system-on-wafer processors built as a 5-by-5 array: 25 ultra-high-performance dies interconnected with TSMC’s integrated fan-out (InFO) packaging so they act as a single processor and perform more efficiently than comparable multi-processor machines.
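
To put those figures in perspective, here is a minimal back-of-envelope sketch in Python. It assumes the 322-teraflop figure cited above applies per D1 die, treats a single H100 as roughly 1,000 BF16 teraflops (an assumed round number, not from the article), and ignores interconnect and utilization losses, so the results are illustrative rather than Tesla’s own accounting.

```python
# Back-of-envelope estimate of Dojo training-tile throughput and of how many
# tiles an "8,000 H100-equivalent" cluster would imply under ideal scaling.

D1_TFLOPS_PER_DIE = 322        # per-die figure cited earlier in the article
DIES_PER_TILE = 5 * 5          # 5-by-5 system-on-wafer array of D1 dies
H100_TFLOPS_BF16 = 1000        # assumed round figure for one H100 (BF16)
H100_EQUIVALENT_TARGET = 8000  # Musk's stated H100-equivalent for Dojo 1

tile_tflops = D1_TFLOPS_PER_DIE * DIES_PER_TILE             # ~8,050 TFLOPS per tile
cluster_tflops = H100_TFLOPS_BF16 * H100_EQUIVALENT_TARGET  # ~8 EFLOPS overall
tiles_needed = cluster_tflops / tile_tflops                 # ~1,000 tiles

print(f"Per-tile throughput: {tile_tflops:,} TFLOPS")
print(f"8,000-H100-equivalent target: {cluster_tflops / 1e6:.1f} EFLOPS")
print(f"Implied training tiles (ideal scaling): {tiles_needed:,.0f}")
```

Under those assumptions the cluster works out to roughly a thousand 25-die tiles; real-world utilization, precision modes, and networking overhead would change the count considerably.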

TSMC manufactures the Dojo chips for Tesla, and Musk will run them alongside his Nvidia-powered Memphis Supercluster. However, while the Tennessee facility is owned by xAI and is used primarily to train Grok, the Dojo chips are tuned for machine learning on video and will be used to train Tesla’s Full Self-Driving technology on video data gathered from Tesla cars.

Counting all the chips he has on hand, Musk said he would have 90,000 Nvidia H100s, 40,000 AI4 chips, and the Dojo D1 wafers running by the end of 2024. This substantial computing power shows how much effort and how many resources the billionaire is pouring into artificial intelligence.


Read the full article at Tom's Hardware

media: Tom's Hardware  
