
Tesla’s Dojo is impressive, but it won’t transform supercomputing

Dojo looks good at first glance, but its reported performance numbers and niche purpose place it outside the ranks of true supercomputers.


Tesla’s AI Day event included the reveal of several potential new products, and while bipedal robots are exciting (if perhaps a bit unrealistic), the real news to follow is Tesla’s new in-house designed supercomputer, called Dojo.


To call Dojo a full-fledged supercomputer is a bit generous, though: It hasn’t been fully assembled yet, and its potential performance limits have yet to be tested. What Tesla promises, though, is nothing short of a supercomputing breakthrough. 

The most powerful supercomputer in the world, Fugaku, lives at the RIKEN Center for Computational Science in Japan. At its tested limit it is capable of 442,010 teraflops (TFLOPS, or trillions of floating-point operations per second), and theoretically it could reach 537,212 TFLOPS. Dojo, Tesla said, could end up breaking the exaflop barrier, something that no supercomputing company, university or government has yet done.
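The units are easy to lose track of, so a small Python sketch may help. It uses only the two Fugaku figures quoted above and assumes decimal prefixes (1 PFLOPS = 1,000 TFLOPS, 1 EFLOPS = 1,000 PFLOPS); the helper name `tflops_to` is made up for illustration.

```python
# Decimal unit prefixes for FLOPS (floating-point operations per second).
TFLOPS_PER_PFLOPS = 1_000
TFLOPS_PER_EFLOPS = 1_000_000

# Fugaku figures as quoted in the article, in TFLOPS.
FUGAKU_MEASURED_TFLOPS = 442_010
FUGAKU_THEORETICAL_TFLOPS = 537_212


def tflops_to(value_tflops: float, unit: str) -> float:
    """Convert a TFLOPS figure to PFLOPS or EFLOPS (hypothetical helper)."""
    divisors = {"PFLOPS": TFLOPS_PER_PFLOPS, "EFLOPS": TFLOPS_PER_EFLOPS}
    return value_tflops / divisors[unit]


measured_pflops = tflops_to(FUGAKU_MEASURED_TFLOPS, "PFLOPS")
measured_eflops = tflops_to(FUGAKU_MEASURED_TFLOPS, "EFLOPS")
# Share of the theoretical peak that the measured run achieved.
efficiency = FUGAKU_MEASURED_TFLOPS / FUGAKU_THEORETICAL_TFLOPS

print(f"Measured: {measured_pflops:.2f} PFLOPS ({measured_eflops:.3f} EFLOPS)")
print(f"Measured / theoretical: {efficiency:.1%}")
```

Seen this way, Fugaku's measured figure sits at less than half an exaflop, which is what makes Tesla's claim stand out.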


Putting that claim in perspective means understanding the scale and capabilities of Dojo and other supercomputers.

First, Dojo is designed to do one particular thing: train artificial intelligence. Tesla is building Dojo for use in-house, processing video data from the millions of Tesla vehicles on the road. Dojo is built on Tesla’s D1 chip, the second chip the company has designed. The D1 is built using seven-nanometer technology and is capable of 362 TFLOPS on its own.

D1 chips don’t operate individually, however, and the smallest unit Tesla has built is what it calls a Dojo training tile. Each tile reportedly connects 500,000 nodes and is capable of nine petaflops (1 PFLOPS = 1,000 TFLOPS). All of that speed is available in a tile less than one cubic foot in size.

Tesla’s plan for Dojo tiles is to network them into larger systems. Its first design goal is to build a computer cabinet housing two trays, each containing six Dojo tiles. In that configuration, Tesla said, a cabinet would be able to handle 100+ PFLOPS. Beyond that, Tesla plans to build what it calls an ExaPOD, consisting of 10 Dojo cabinets, that will be able to perform 1.1 exaflops (1 EFLOPS = 1,000 PFLOPS).
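The scaling from tile to cabinet to ExaPOD is simple multiplication, and it is worth checking that the reported figures agree with each other. The sketch below uses only the numbers Tesla has given, none of which have been independently verified, and assumes performance scales perfectly as tiles are added, which real systems rarely manage.

```python
# Figures as reported by Tesla; none are independently verified.
PFLOPS_PER_TILE = 9        # one Dojo training tile
TILES_PER_TRAY = 6
TRAYS_PER_CABINET = 2
CABINETS_PER_EXAPOD = 10
PFLOPS_PER_EFLOPS = 1_000

# Assumes perfect linear scaling, with no networking or software overhead.
tiles_per_cabinet = TILES_PER_TRAY * TRAYS_PER_CABINET
cabinet_pflops = tiles_per_cabinet * PFLOPS_PER_TILE
exapod_tiles = tiles_per_cabinet * CABINETS_PER_EXAPOD
exapod_eflops = cabinet_pflops * CABINETS_PER_EXAPOD / PFLOPS_PER_EFLOPS

print(f"Tiles per cabinet: {tiles_per_cabinet}")
print(f"Cabinet: {cabinet_pflops} PFLOPS")
print(f"ExaPOD: {exapod_tiles} tiles, {exapod_eflops:.2f} EFLOPS")
```

Twelve tiles at nine PFLOPS each gives 108 PFLOPS per cabinet, which matches the "100+" claim, and ten cabinets gives 1.08 EFLOPS, which rounds to the 1.1 figure.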

Fugaku, on the other hand, takes up an entire room, and at its peak measured performance is capable of 442 PFLOPS.

Back to that earlier perspective: Tesla believes that it will be capable of more than doubling the performance of Fugaku with 422 fewer racks, and that it will be able to do so by next year despite only having reached the milestone of building and testing a single tile.
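For a sense of how large that claim is, here is a rough Python sketch of the comparison. It treats each Dojo cabinet as one rack and infers Fugaku's rack count from the "422 fewer" figure, both of which are assumptions. The two performance numbers were also not measured the same way, a point the next section takes up, so the ratio is illustrative at best.

```python
# Figures from the article. The two systems were not benchmarked the same
# way, so this is not an apples-to-apples comparison.
FUGAKU_MEASURED_PFLOPS = 442
EXAPOD_CLAIMED_PFLOPS = 1_100
EXAPOD_CABINETS = 10       # assumption: one cabinet counts as one rack
RACK_DIFFERENCE = 422      # "422 fewer racks"

speedup = EXAPOD_CLAIMED_PFLOPS / FUGAKU_MEASURED_PFLOPS
implied_fugaku_racks = EXAPOD_CABINETS + RACK_DIFFERENCE
pflops_per_fugaku_rack = FUGAKU_MEASURED_PFLOPS / implied_fugaku_racks
pflops_per_dojo_cabinet = EXAPOD_CLAIMED_PFLOPS / EXAPOD_CABINETS

print(f"Claimed ExaPOD vs. measured Fugaku: {speedup:.2f}x")
print(f"Implied Fugaku racks: {implied_fugaku_racks}")
print(f"Per rack: {pflops_per_fugaku_rack:.2f} PFLOPS (Fugaku) "
      f"vs. {pflops_per_dojo_cabinet:.0f} PFLOPS (Dojo, claimed)")
```

On these numbers Tesla is claiming roughly two and a half times Fugaku's measured output, with about a hundred times the performance per rack.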

The reality of Tesla’s Dojo claims

Dojo’s reported capabilities don’t grant it true high-performance computing (HPC) status, said Gartner research vice president Chirag Dekate, largely because it hasn’t been tested using the same standards as Fugaku and other supercomputers.

“The Tesla Dojo is an AI-specific supercomputer designed to accelerate machine learning and deep learning activities. Its lower precision focus limits applicability to a broader HPC context,” Dekate said.

The measurements provided by Tesla indicate that Dojo’s impressive speeds were measured using three number formats: BF16, CFP8 and FP32. Each name indicates how many bits a single number occupies in the computer’s memory: 16, 8 and 32, respectively. Fewer bits mean faster arithmetic but less precise results.


“For the most part, HPC applications rely on higher-order precision (FP64) than the ones supported by Dojo, which is more designed for extreme-scale deep learning and machine learning tasks,” Dekate said. 
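To see why the format matters, here is a minimal Python sketch of what happens to one value as the bit count shrinks. Python's own floats are FP64, so they serve as the reference. The helper names `round_to_fp32` and `truncate_to_bf16` are made up for this example, and the BF16 version simply chops off the low bits of the FP32 encoding, whereas real hardware typically rounds. CFP8 is Tesla's own configurable 8-bit format and is not modeled here.

```python
import struct

# Bits per number in each format mentioned in the article.
BIT_WIDTHS = {"CFP8": 8, "BF16": 16, "FP32": 32, "FP64": 64}


def round_to_fp32(x: float) -> float:
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def truncate_to_bf16(x: float) -> float:
    """Approximate BF16 by keeping only the top 16 bits of the FP32 encoding.

    Simplification: real hardware usually rounds instead of truncating.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


value = 1.0001  # a small step away from 1.0

print(f"FP64: {value!r}")
print(f"FP32: {round_to_fp32(value)!r}")
print(f"BF16: {truncate_to_bf16(value)!r}")
```

In BF16 the small step vanishes and the value collapses to exactly 1.0. That kind of loss is usually tolerable when training a neural network, but it can be a problem for the scientific simulations that traditional supercomputers run.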

None of this means that what Tesla has developed with Dojo is unimpressive: It could prove to be an industry leader in machine learning training. That said, calling it a supercomputer and claiming it will break the exaflop barrier may be a difficult sell when everyone else rates their systems using FP64, a format twice as wide as the most precise one Tesla cited.


Source: TechRepublic