AMD has announced the Instinct MI100 accelerator, built on its next-generation CDNA GPU architecture. The company calls it the fastest HPC GPU in the world. CDNA is a separate architecture from RDNA and is designed specifically for the HPC segment.
The card features a 7nm CDNA GPU with 120 Compute Units, or 7680 stream processors. The GPU powering the Instinct MI100 measures around 720 mm². It is clocked at around 1500 MHz and delivers a peak throughput of 11.5 TFLOPs in FP64 and 23.1 TFLOPs in FP32, with a total power draw of 300 watts. The card will be offered in 4- and 8-GPU configurations, with a rated GPU-to-GPU bandwidth of 276 GB/s.
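The quoted FP32 and FP64 figures can be roughly reproduced from the stream processor count and clock speed. The sketch below is a back-of-the-envelope check, not AMD's published method. It assumes each stream processor completes one fused multiply-add (counted as 2 FLOPs) per clock and that FP64 runs at half the FP32 rate. Both factors, and the function name `peak_tflops`, are assumptions made for illustration.

```python
# Back-of-the-envelope peak throughput for the Instinct MI100.
# The per-clock factors are assumptions, not AMD-published formulas.
STREAM_PROCESSORS = 7680      # from the article
CLOCK_HZ = 1.5e9              # "around 1500 MHz" per the article
FLOPS_PER_SP_PER_CLOCK = 2    # assumption: one fused multiply-add = 2 FLOPs
FP64_RATE = 0.5               # assumption: FP64 at half the FP32 rate


def peak_tflops(stream_processors, clock_hz, flops_per_clock, rate=1.0):
    """Return theoretical peak throughput in TFLOPs."""
    return stream_processors * clock_hz * flops_per_clock * rate / 1e12


fp32 = peak_tflops(STREAM_PROCESSORS, CLOCK_HZ, FLOPS_PER_SP_PER_CLOCK)
fp64 = peak_tflops(STREAM_PROCESSORS, CLOCK_HZ, FLOPS_PER_SP_PER_CLOCK, FP64_RATE)

print(f"FP32: {fp32:.2f} TFLOPs")
print(f"FP64: {fp64:.2f} TFLOPs")
```

Under these assumptions the estimate comes out slightly below the quoted 23.1 TFLOPs for FP32 and in line with the quoted 11.5 TFLOPs for FP64. The small FP32 gap would be consistent with an actual clock marginally above 1500 MHz.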
With the new AMD Matrix Core technology, the MI100 also delivers a nearly 7x boost in FP16 theoretical peak floating-point performance for AI training workloads compared to AMD’s prior generation accelerators.
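The "nearly 7x" claim can be sanity-checked against the FP16 figures in this article's specification table. The article does not say which prior accelerator AMD used as the baseline, so the sketch below compares against both the MI50 and the MI60. The dictionary and the helper name `speedup` are hypothetical.

```python
# FP16 peak figures (TFLOPs) as listed in this article's specification table.
FP16_TFLOPS = {"MI50": 26.5, "MI60": 29.5, "MI100": 185.0}


def speedup(new_tflops, old_tflops):
    """Ratio of new peak throughput to old peak throughput."""
    return new_tflops / old_tflops


# Which card AMD treats as the baseline is an assumption; try both.
for baseline in ("MI50", "MI60"):
    ratio = speedup(FP16_TFLOPS["MI100"], FP16_TFLOPS[baseline])
    print(f"MI100 vs {baseline}: {ratio:.2f}x")
```

The ratio against the MI50 lands just under 7x, while the ratio against the MI60 is closer to 6.3x, which suggests the MI50 is the more likely baseline for the claim.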
AMD also claims up to 2.1x more performance per dollar than the popular NVIDIA A100.
The known specifications so far, compared with earlier Instinct accelerators:
Accelerator Name | AMD Radeon Instinct MI6 | AMD Radeon Instinct MI8 | AMD Radeon Instinct MI25 | AMD Radeon Instinct MI50 | AMD Radeon Instinct MI60 | AMD Instinct MI100 |
GPU Architecture | Polaris 10 | Fiji XT | Vega 10 | Vega 20 | Vega 20 | Arcturus |
GPU Process Node | 14nm FinFET | 28nm | 14nm FinFET | 7nm FinFET | 7nm FinFET | 7nm FinFET |
GPU Cores | 2304 | 4096 | 4096 | 3840 | 4096 | 7680 |
GPU Clock Speed | 1237 MHz | 1000 MHz | 1500 MHz | 1725 MHz | 1800 MHz | ~1500 MHz |
FP16 Compute | 5.7 TFLOPs | 8.2 TFLOPs | 24.6 TFLOPs | 26.5 TFLOPs | 29.5 TFLOPs | 185 TFLOPs |
FP32 Compute | 5.7 TFLOPs | 8.2 TFLOPs | 12.3 TFLOPs | 13.3 TFLOPs | 14.7 TFLOPs | 23.1 TFLOPs |
FP64 Compute | 384 GFLOPs | 512 GFLOPs | 768 GFLOPs | 6.6 TFLOPs | 7.4 TFLOPs | 11.5 TFLOPs |
VRAM | 16 GB GDDR5 | 4 GB HBM1 | 16 GB HBM2 | 16 GB HBM2 | 32 GB HBM2 | 32 GB HBM2 |
Memory Clock | 1750 MHz | 500 MHz | 945 MHz | 1000 MHz | 1000 MHz | 1200 MHz |
Memory Bus | 256-bit bus | 4096-bit bus | 2048-bit bus | 4096-bit bus | 4096-bit bus | 4096-bit bus |
Memory Bandwidth | 224 GB/s | 512 GB/s | 484 GB/s | 1 TB/s | 1 TB/s | 1.23 TB/s |
Form Factor | Single Slot, Full Length | Dual Slot, Half Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length |
Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling |
TDP | 150W | 175W | 300W | 300W | 300W | 300W |
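The Memory Bandwidth row follows from the Memory Bus and Memory Clock rows. The sketch below recomputes it for several cards. It assumes two transfers per clock for HBM and four for the GDDR5 on the MI6. Those factors and the function name `bandwidth_gb_s` are illustrative assumptions, not figures taken from the table.

```python
def bandwidth_gb_s(bus_bits, clock_mhz, transfers_per_clock):
    """Theoretical memory bandwidth in GB/s from bus width and memory clock."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 * transfers_per_clock / 1e9


# (bus width in bits, memory clock in MHz, assumed transfers per clock)
CARDS = {
    "MI6":   (256, 1750, 4),   # GDDR5: assumed 4 transfers per clock
    "MI8":   (4096, 500, 2),   # HBM: assumed 2 transfers per clock
    "MI25":  (2048, 945, 2),
    "MI50":  (4096, 1000, 2),
    "MI100": (4096, 1200, 2),
}

for name, (bus, clock, transfers) in CARDS.items():
    print(f"{name}: {bandwidth_gb_s(bus, clock, transfers):.1f} GB/s")
```

Under these assumptions the MI100 works out to about 1229 GB/s, matching the 1.23 TB/s in the table, and the older cards also line up with their listed values.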
“Today AMD takes a major step forward in the journey toward exascale computing as we unveil the AMD Instinct MI100 – the world’s fastest HPC GPU,” said Brad McCredie, corporate vice president, Data Center GPU and Accelerated Processing, AMD. “Squarely targeted toward the workloads that matter in scientific computing, our latest accelerator, when combined with the AMD ROCm open software platform, is designed to provide scientists and researchers a superior foundation for their work in HPC.”