It’s proven! AMD’s RDNA 2 has better latency than Nvidia’s Ampere

The folks at Chips and Cheese recently ran a GPU memory latency test on AMD’s RDNA 2 and NVIDIA’s Ampere GPU architectures, and the results are more than interesting.

Memory latency has become a crucial factor as designs increasingly use multiple chiplets and separate I/O dies on the same package. The results therefore give a glimpse of how the two architectures behave in practice.

For the test, AMD’s Radeon RX 6800 XT (RDNA 2 GPU) and the NVIDIA GeForce RTX 3090 (Ampere GPU) were pitted against each other. In the cache and memory benchmark, AMD’s RDNA 2 architecture showed lower latency than NVIDIA’s Ampere at every cache level. Its VRAM latency was about the same as Ampere’s, despite having to check two more levels of cache on the way to memory. A hit in the Infinity Cache adds only about 20 ns over an L2 hit and is still faster than Ampere’s L2.

The testers concluded that NVIDIA’s Ampere-based GA102 GPU is simply much larger and uses a more conventional GPU memory subsystem with only two cache levels. Getting around the large die appears to take a lot of cycles, and going from L1 to L2 takes over 100 ns. On RDNA 2, by contrast, the L2 is only about 66 ns away from the L0, even with an L1 cache in between.
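To make the comparison concrete, here is a minimal sketch that simply adds up the approximate figures quoted in this article. It assumes the Infinity Cache penalty stacks directly on top of the L0-to-L2 figure, and it treats Ampere’s "over 100 ns" as a lower bound. The constant and function names are made up for illustration and are not from the Chips and Cheese test.

```python
# Approximate figures quoted in the article, in nanoseconds.
RDNA2_L0_TO_L2_NS = 66        # RDNA 2: L2 is "~66 ns away from L0"
INFINITY_CACHE_EXTRA_NS = 20  # Infinity Cache: "about 20 ns over an L2 hit"
AMPERE_L1_TO_L2_NS = 100      # Ampere: "over 100 ns", used here as a lower bound


def infinity_cache_hit_ns(l2_ns=RDNA2_L0_TO_L2_NS, extra_ns=INFINITY_CACHE_EXTRA_NS):
    """Estimate an Infinity Cache hit as L2 latency plus the quoted penalty.

    Assumption: the two figures are simply additive.
    """
    return l2_ns + extra_ns


estimate_ns = infinity_cache_hit_ns()
margin_ns = AMPERE_L1_TO_L2_NS - estimate_ns

print(f"Estimated RDNA 2 Infinity Cache hit: ~{estimate_ns} ns")
print(f"Ampere L2 hit: over {AMPERE_L1_TO_L2_NS} ns (at least {margin_ns} ns slower)")
```

Under these assumptions, a hit in RDNA 2’s third cache level still comes in below Ampere’s second level, which matches the testers’ claim.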


We know that AMD’s Navi 21 GPU features a 4 MB L2 cache while the NVIDIA GA102 GPU features a 6 MB L2 cache for the whole chip. The NVIDIA A100 Ampere GPU for HPC features a massive 40 MB L2 cache.

The folks at Chips and Cheese had this to say about the results:

RDNA 2’s cache is fast and there’s a lot of it. Compared to Ampere, latency is low at all levels. Infinity Cache only adds about 20 ns over an L2 hit and has lower latency than Ampere’s L2. Amazingly, RDNA 2’s VRAM latency is about the same as Ampere’s, even though RDNA 2 is checking two more levels of cache on the way to memory.


In contrast, Nvidia sticks with a more conventional GPU memory subsystem with only two levels of cache and high L2 latency. Going from Ampere’s SM-private L1 to L2 takes over 100 ns. RDNA’s L2 is ~66 ns away from L0, even with an L1 cache between them. Getting around GA102’s massive die seems to take a lot of cycles.

This could explain AMD’s excellent performance at lower resolutions. RDNA 2’s low latency L2 and L3 caches may give it an advantage with smaller workloads, where occupancy is too low to hide latency. Nvidia’s Ampere chips in comparison require more parallelism to shine.
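The point about occupancy and latency hiding can be illustrated with Little’s law: the number of memory requests that must be in flight to keep a memory pipeline busy is roughly latency multiplied by request rate. The sketch below is only a back-of-the-envelope illustration. The request rate is a hypothetical value chosen for the example, not a measured figure for either GPU, and the function name is made up.

```python
import math


def requests_in_flight_needed(latency_ns, requests_per_ns):
    """Little's law: concurrency needed = latency x throughput."""
    return math.ceil(latency_ns * requests_per_ns)


# Hypothetical request rate, for illustration only (not a measured value).
HYPOTHETICAL_REQUESTS_PER_NS = 0.5

# L2 latencies as quoted in the article (approximate).
l2_latency_ns = {
    "RDNA 2 L2 (~66 ns)": 66,
    "Ampere L2 (over 100 ns)": 100,
}

for name, latency in l2_latency_ns.items():
    needed = requests_in_flight_needed(latency, HYPOTHETICAL_REQUESTS_PER_NS)
    print(f"{name}: about {needed} requests in flight to hide the latency")
```

At the same assumed request rate, the higher-latency cache needs more independent work in flight to stay busy, which is why a small workload with low occupancy would expose Ampere’s L2 latency more readily.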

Source


Nivedita Bangari
I am a software engineer by profession and technology is my love, learning and playing with new technologies is my passion.