In its latest announcement, Cerebras Systems claims to have created the first brain-scale AI solution: a single system that can support AI models with 120 trillion parameters, exceeding the roughly 100 trillion synapses in the human brain.
Typical AI workloads today run on clusters of GPUs, which top out at around 1 trillion parameters. Cerebras accomplishes this industry first with a single 850,000-core system, but it can also spread workloads across up to 192 CS-2 systems, roughly 163 million AI-optimized cores in total, to unlock even more performance.
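As a quick sanity check on those figures, the total core count follows directly from multiplying systems by cores per system. A minimal Python sketch, where the constants are simply the numbers quoted above:

```python
# Back-of-envelope check on the cluster figures above (illustrative only).
CORES_PER_CS2 = 850_000   # AI-optimized cores in a single CS-2
MAX_SYSTEMS = 192         # maximum CS-2 systems in one cluster

total_cores = CORES_PER_CS2 * MAX_SYSTEMS
print(f"Total cores across {MAX_SYSTEMS} systems: {total_cores:,}")
# -> Total cores across 192 systems: 163,200,000 (about 163 million)

# Headroom over a 1-trillion-parameter GPU-cluster ceiling
print(f"Parameter headroom: {120e12 / 1e12:.0f}x")
# -> Parameter headroom: 120x
```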
The Cerebras CS-2 is currently the world’s fastest AI processor and one of the most unusual semiconductor devices on the planet: 46,225 mm² of silicon, 2.6 trillion transistors, and 850,000 AI-optimized cores, all packed onto a single wafer-sized 7 nm processor.
Each chip is embedded in a single CS-2 system, yet even at that scale, memory limits the size of the AI models the chip can run. The CS-2 has 40 GB of on-chip SRAM; by adding a new external cabinet with additional memory, the company can now run far larger, brain-scale AI models.
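To see why the external cabinet matters, consider the raw storage a 120-trillion-parameter model would need. The estimate below assumes 2 bytes per weight (fp16), a precision the announcement does not specify:

```python
# Rough storage estimate for a brain-scale model (illustrative only).
PARAMS = 120e12          # 120 trillion parameters
BYTES_PER_WEIGHT = 2     # assumption: fp16 weights; not stated in the announcement
ON_CHIP_SRAM = 40e9      # 40 GB of on-chip SRAM

weights_bytes = PARAMS * BYTES_PER_WEIGHT
print(f"Weights alone: {weights_bytes / 1e12:.0f} TB")                     # -> 240 TB
print(f"Shortfall vs on-chip SRAM: {weights_bytes / ON_CHIP_SRAM:,.0f}x")  # -> 6,000x
```

Even under that generous assumption, the weights alone exceed the on-chip SRAM by a factor of several thousand, which is exactly the gap an external memory cabinet is meant to close.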
Scalability brings its own challenges. Even with 20 petabytes per second of memory bandwidth and 220 petabits per second of aggregate fabric bandwidth, it is difficult to establish communication between multiple chips that share a full workload, and the system’s extreme computational horsepower makes scaling performance across multiple systems just as hard.
To solve this problem, Cerebras’ multi-node solution takes a different approach: model parameters are stored off-chip in the MemoryX cabinet and streamed to the processor, so the system behaves as if the entire model were held on-chip.
This not only allows a single system to compute larger AI models, it also sidesteps the latency and memory bandwidth issues that typically restrict scalability across groups of smaller processors, like GPUs.
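The shape of that approach can be sketched in a few lines of Python. This is purely an illustration of the stream-weights-in, keep-activations-local pattern; the names (MemoryXStore, stream_layer, forward_pass) are invented for the example and are not Cerebras APIs:

```python
# Illustrative sketch of an off-chip parameter store, not Cerebras' actual stack.

class MemoryXStore:
    """Stands in for an external cabinet holding all model weights."""
    def __init__(self, layer_weights):
        self.layer_weights = layer_weights  # {layer_id: weights}

    def stream_layer(self, layer_id):
        # In the real system this would be a high-bandwidth fetch;
        # here it is just a dictionary lookup.
        return self.layer_weights[layer_id]

def forward_pass(activations, store, num_layers, compute_layer):
    # Activations stay "on chip"; weights arrive one layer at a time,
    # so model size is bounded by external capacity, not on-chip SRAM.
    for layer_id in range(num_layers):
        weights = store.stream_layer(layer_id)
        activations = compute_layer(activations, weights)
    return activations

# Tiny demo with scalar "tensors":
store = MemoryXStore({0: 2.0, 1: 3.0})
print(forward_pass(1.0, store, 2, lambda a, w: a * w))  # -> 6.0
```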
Cerebras’ switches connect the systems to the memory box, which comes with anywhere from 4 TB to 2.4 PB of capacity. The memory is a mix of flash and DRAM, and this single box can store up to 120 trillion weights, with a ‘few’ x86 processors running the software and data plane for the system.
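Those two headline numbers are at least self-consistent: dividing the maximum capacity by the maximum weight count gives about 20 bytes per stored weight. The check below treats 2.4 PB as decimal petabytes, and the interpretation of the extra headroom (optimizer state, redundancy) is an assumption rather than something the announcement spells out:

```python
# Consistency check on the MemoryX headline figures (illustrative only).
CAPACITY_BYTES = 2.4e15   # 2.4 PB, assumed decimal petabytes
MAX_WEIGHTS = 120e12      # 120 trillion weights

print(f"Budget per stored weight: {CAPACITY_BYTES / MAX_WEIGHTS:.0f} bytes")
# -> 20 bytes, comfortably more than the 2-4 bytes a raw weight value needs
```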
“The last several years have shown us that, for NLP models, insights scale directly with parameters: the more parameters, the better the results,” says Rick Stevens, Associate Director, Argonne National Laboratory. “Cerebras’ inventions, which will provide a 100x increase in parameter capacity, may have the potential to transform the industry. For the first time, we will be able to explore brain-sized models, opening up vast new avenues of research and insight.”