The Radeon RX 7900 series, based on the Navi 31 GPU, is AMD’s flagship RDNA 3 product. AMD positioned Navi 31 against NVIDIA’s RTX 4080, but its launch performance was underwhelming and did not cause the massive disruption that many expected. Speculation now points to a possible cause: the flagship cards may have shipped with incomplete GPU silicon.
According to Kepler_L2, early RDNA 3 silicon contains non-functioning shader prefetch hardware. The flaw is present in three chips: GFX1100 (Navi 31), GFX1102 (Navi 33), and GFX1103 (the Phoenix APU lineup). The claim is based on a recent GitHub submission. According to Kepler, Navi 32 GPUs are based on the ‘GFX1101’ IP, whilst Navi 33 GPUs are based on the ‘GFX1102’ IP. Navi 32, which will be used in multiple mainstream discrete GPUs for desktop and mobile devices by early 2023, is the only exception; the same flaw appears on every other chip.
According to Kepler, this cannot be rectified or revised in a few weeks and will take several months if AMD intends to address it at all.
If fixed silicon is released a few months later, the bulk of gamers will already have purchased the early iteration of the Radeon RX 7900 series with the unfinished silicon. AMD may prepare a refresh a year later to remedy these concerns, but for the time being, the Radeon RX 7900 series may have to rely heavily on driver-level optimizations to compensate for the incomplete state of its top RDNA 3 silicon.
But that’s not all. Other prominent innovations, such as the VOPD instructions found on RDNA 3 GPUs, were claimed to offer a significant speed boost, but in practice they delivered only about a 4% improvement in a ray tracing test scene. By leveraging 64 multi-precision, multi-purpose ALUs implemented across two SIMD32 units, the RDNA 3 silicon was built to handle dual Wave32 instructions for twice the floating point performance. Speaking with an AMD representative, HardwareTimes was able to confirm that AMD is still fine-tuning performance here and expects further optimizations along the way:
Wave64 natively can access the new ALUs for 2x execution rates to unlock performance during dense ALU code execution. For Wave32 mode, the compiler does localized reordering and packing of instructions into the VOPD encoding. An RT test scene using VOPD encodings provided approximately a 4% increase in frames per second by removing the ALU bottleneck. We expect to see further improvements as the compiler matures with more optimizations for mapping code sequences to the VOPD encodings. And with the advances happening in the use of AI, RT, and compute-driven rendering techniques for more life-like rendering, we expect to see codes bound by ALU that will exploit these new ALUs more and more.
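The packing step mentioned in the statement above can be illustrated with a toy sketch. This is not AMD’s compiler: the `Instr` type, the `independent` check, and the `pack_pairs` function are hypothetical names invented for this example, the opcodes are generic placeholders, and real VOPD encodings carry additional restrictions that are ignored here. The sketch only pairs adjacent instructions that do not depend on each other and leaves out the reordering the statement describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction: an opcode, one destination, some sources."""
    op: str
    dst: str
    srcs: tuple


def independent(a, b):
    """True if b can issue alongside a without a register hazard."""
    return (
        a.dst not in b.srcs        # b does not read a's result
        and a.dst != b.dst         # both do not write the same register
        and b.dst not in a.srcs    # b does not overwrite a's inputs
    )


def pack_pairs(instrs):
    """Greedily bundle each instruction with its neighbour when independent."""
    bundles = []
    i = 0
    while i < len(instrs):
        if i + 1 < len(instrs) and independent(instrs[i], instrs[i + 1]):
            bundles.append((instrs[i], instrs[i + 1]))  # dual-issue bundle
            i += 2
        else:
            bundles.append((instrs[i],))                # single-issue
            i += 1
    return bundles


program = [
    Instr("mul", "v0", ("v4", "v5")),
    Instr("add", "v1", ("v6", "v7")),
    Instr("add", "v2", ("v0", "v1")),  # depends on both results above
    Instr("mul", "v3", ("v2", "v8")),  # depends on v2
]
bundles = pack_pairs(program)
for bundle in bundles:
    print(" || ".join(f"{ins.op} {ins.dst}" for ins in bundle))
```

In this toy program only the first two instructions can be paired, so four instructions occupy three issue slots. Code with long dependency chains leaves few pairing opportunities, which is one plausible reason the measured gain depends so heavily on how well the compiler finds them.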
Reviewers who got an early look at numerous custom AMD Radeon RX 7900 series graphics cards noticed irregular and variable clock speeds. On review day, NVIDIA’s Ada Lovelace architecture beat RDNA 3 GPUs in every power efficiency metric, even though AMD had predicted a gain of up to 54% in power efficiency.