Nvidia’s Ampere A100 HPC upgraded GPU will be even faster than the original models


Nvidia’s Ampere A100 is undoubtedly the company’s fastest GPU, and the latest reports claim that the graphics maker is planning to make it even faster, with twice the memory capacity and record-breaking memory bandwidth.

Last year the green team introduced its NVIDIA A100 HPC accelerator, and now the company is planning to give it a major spec upgrade. The card, based on NVIDIA’s Ampere GA100 GPU, houses an insane 54 billion transistors.

Coming to the specifications, the new A100 PCIe GPU accelerator retains the same core configuration as the 250W variant, with 6912 CUDA cores arranged in 108 SM units and 432 Tensor Cores, but doubles the memory to 80 GB of HBM2e with a bandwidth of 2.0 TB/s.
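As a rough sanity check, peak memory bandwidth follows from bus width times per-pin data rate. The GA100 die exposes a 6144-bit physical interface, but the 80 GB card ships with five active HBM2e stacks (a 5120-bit effective bus); the ~3.2 Gbps pin rate below is an assumed illustrative value, not an official figure:

```python
# Peak bandwidth (GB/s) = (bus width in bits / 8 bytes) * per-pin rate (Gbps).
# Assumed inputs: 5120-bit effective bus (5 active HBM2e stacks), ~3.2 Gbps/pin.
def peak_bandwidth_gbps(bus_bits: int, pin_rate_gbps: float) -> float:
    """Return peak memory bandwidth in GB/s."""
    return bus_bits / 8 * pin_rate_gbps

print(peak_bandwidth_gbps(5120, 3.2))  # 2048.0 GB/s, i.e. the quoted ~2.0 TB/s
```

With these assumptions the result lands at 2048 GB/s, which is how a ~2.0 TB/s headline figure falls out of the stack configuration.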

On the performance side, FP64 is rated at 9.7 TFLOPs (19.5 TFLOPs via Tensor Cores), FP32 at 19.5 TFLOPs, TF32 at 156/312 TFLOPs (with sparsity), FP16 at 312/624 TFLOPs (with sparsity), and finally INT8 at 624/1248 TOPs (with sparsity). The latest HPC accelerator will be released next week, and the price is estimated at over $20,000 US.
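Those peak numbers can be reproduced from the core counts and the 1410 MHz boost clock. The FP16 line assumes each third-generation Tensor Core executes 256 FP16 FMAs (512 FLOPs) per clock, in line with NVIDIA's published Ampere figures; treat this as an illustrative sketch rather than a benchmark:

```python
BOOST_HZ = 1.41e9  # 1410 MHz boost clock

# FP32 via CUDA cores: one FMA = 2 FLOPs per core per clock.
fp32_tflops = 2 * 6912 * BOOST_HZ / 1e12
print(round(fp32_tflops, 1))  # 19.5

# FP16 via Tensor Cores: 512 FLOPs per Tensor Core per clock (assumed rate).
fp16_tflops = 432 * 512 * BOOST_HZ / 1e12
print(round(fp16_tflops))     # 312; structured sparsity doubles this to 624
```

The 2:1 sparsity ratios in the spec sheet are simply these dense figures doubled, since 2:4 structured sparsity lets the Tensor Cores skip half the multiplications.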

NVIDIA Ampere GA100 GPU-Based Tesla A100 Specs:

| NVIDIA Tesla Graphics Card | Tesla K40 (PCIe) | Tesla M40 (PCIe) | Tesla P100 (PCIe) | Tesla P100 (SXM2) | Tesla V100 (SXM2) | Tesla V100S (PCIe) | NVIDIA A100 (SXM4) | NVIDIA A100 (PCIe4) |
|---|---|---|---|---|---|---|---|---|
| GPU | GK110 (Kepler) | GM200 (Maxwell) | GP100 (Pascal) | GP100 (Pascal) | GV100 (Volta) | GV100 (Volta) | GA100 (Ampere) | GA100 (Ampere) |
| Process Node | 28nm | 28nm | 16nm | 16nm | 12nm | 12nm | 7nm | 7nm |
| Transistors | 7.1 Billion | 8 Billion | 15.3 Billion | 15.3 Billion | 21.1 Billion | 21.1 Billion | 54.2 Billion | 54.2 Billion |
| GPU Die Size | 551 mm2 | 601 mm2 | 610 mm2 | 610 mm2 | 815 mm2 | 815 mm2 | 826 mm2 | 826 mm2 |
| SMs | 15 | 24 | 56 | 56 | 80 | 80 | 108 | 108 |
| TPCs | 15 | 24 | 28 | 28 | 40 | 40 | 54 | 54 |
| FP32 CUDA Cores Per SM | 192 | 128 | 64 | 64 | 64 | 64 | 64 | 64 |
| FP64 CUDA Cores / SM | 64 | 4 | 32 | 32 | 32 | 32 | 32 | 32 |
| FP32 CUDA Cores | 2880 | 3072 | 3584 | 3584 | 5120 | 5120 | 6912 | 6912 |
| FP64 CUDA Cores | 960 | 96 | 1792 | 1792 | 2560 | 2560 | 3456 | 3456 |
| Tensor Cores | N/A | N/A | N/A | N/A | 640 | 640 | 432 | 432 |
| Texture Units | 240 | 192 | 224 | 224 | 320 | 320 | 432 | 432 |
| Boost Clock | 875 MHz | 1114 MHz | 1329 MHz | 1480 MHz | 1530 MHz | 1601 MHz | 1410 MHz | 1410 MHz |
| TOPs (DNN/AI) | N/A | N/A | N/A | N/A | 125 TOPs | 130 TOPs | 1248 TOPs (2496 TOPs with Sparsity) | 1248 TOPs (2496 TOPs with Sparsity) |
| FP16 Compute | N/A | N/A | 18.7 TFLOPs | 21.2 TFLOPs | 30.4 TFLOPs | 32.8 TFLOPs | 312 TFLOPs (624 TFLOPs with Sparsity) | 312 TFLOPs (624 TFLOPs with Sparsity) |
| FP32 Compute | 5.04 TFLOPs | 6.8 TFLOPs | 10.0 TFLOPs | 10.6 TFLOPs | 15.7 TFLOPs | 16.4 TFLOPs | 156 TFLOPs (19.5 TFLOPs standard) | 156 TFLOPs (19.5 TFLOPs standard) |
| FP64 Compute | 1.68 TFLOPs | 0.2 TFLOPs | 4.7 TFLOPs | 5.30 TFLOPs | 7.80 TFLOPs | 8.2 TFLOPs | 19.5 TFLOPs (9.7 TFLOPs standard) | 19.5 TFLOPs (9.7 TFLOPs standard) |
| Memory Interface | 384-bit GDDR5 | 384-bit GDDR5 | 4096-bit HBM2 | 4096-bit HBM2 | 4096-bit HBM2 | 4096-bit HBM2 | 6144-bit HBM2e | 6144-bit HBM2e |
| Memory Size | 12 GB GDDR5 @ 288 GB/s | 24 GB GDDR5 @ 288 GB/s | 16 GB HBM2 @ 732 GB/s or 12 GB HBM2 @ 549 GB/s | 16 GB HBM2 @ 732 GB/s | 16 GB HBM2 @ 900 GB/s | 16 GB HBM2 @ 1134 GB/s | Up to 40 GB HBM2 @ 1.6 TB/s or up to 80 GB HBM2 @ 1.6 TB/s | Up to 40 GB HBM2 @ 1.6 TB/s or up to 80 GB HBM2 @ 2.0 TB/s |
| L2 Cache Size | 1536 KB | 3072 KB | 4096 KB | 4096 KB | 6144 KB | 6144 KB | 40960 KB | 40960 KB |
| TDP | 235W | 250W | 250W | 300W | 300W | 250W | 400W | 250W |

Source
