NVIDIA rightfully dominates the AI discussion. Its GPUs are ready to go and are the preferred choice among professionals and companies looking to enter the world of AI. However, just this week, both Intel and AMD tweaked their software stacks to deliver significant speed increases in the generative AI space. With these updates, AMD's Radeon RX 7900 XTX offers better performance per dollar than an NVIDIA RTX 4080 in generative AI, especially when using Stable Diffusion with Automatic1111 (A1111)/xFormers.
Note: As with tuning for crypto mining, GenAI performance varies significantly depending on the model and configuration being used. This article is about the most common A1111 xFormers config (you can get a running tally of average performance by GPU here: https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html), but there *are* hyper-tuned boutique optimizations where the NVIDIA RTX 4080 is faster still.
Using Microsoft Olive and DirectML instead of the PyTorch path could increase the speed of the AMD Radeon RX 7900 XTX from 1.87 iterations per second to 18.59 iterations per second. AMD has published detailed instructions for this. The level of performance achieved in Automatic1111 could be similar to that of the SHARK-based Stable Diffusion approach and should strengthen the company's presence in the generative AI space. It also follows that the 7900 XTX potentially has better GenAI performance per dollar in Stable Diffusion/A1111 than the comparable RTX 4080, at least under current pricing.
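To put the quoted throughput change in perspective, the speedup can be worked out directly from the two figures above. This is a minimal sketch using only the numbers cited in this article; the variable names are illustrative.

```python
# Throughput figures quoted in the article for the RX 7900 XTX,
# in iterations per second (it/s).
pytorch_its = 1.87   # default PyTorch path
olive_its = 18.59    # Microsoft Olive + DirectML path

# Speedup is the ratio of the new throughput to the old one.
speedup = olive_its / pytorch_its

print(f"Olive/DirectML speedup over PyTorch path: {speedup:.2f}x")
```

In other words, the quoted figures amount to roughly a tenfold increase on the same hardware.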
The lowest-priced NVIDIA RTX 4080 available at Newegg at the time of writing (as of 8/19/2023) is the MSI Ventus GeForce RTX 4080 16GB. The least expensive AMD Radeon RX 7900 XTX, also available at Newegg, is the MSI Gaming Radeon RX 7900 XTX 24GB. Before analyzing the numbers, however, it should be noted that the AMD approach might require a bit more technical understanding than the NVIDIA approach. The AMD approach uses Microsoft Olive instead of PyTorch, and most automated installers might not install the necessary dependencies on their own. So, when considering the convenience factor, NVIDIA might remain the more straightforward choice. Still, professionals and small businesses can usually overcome the initial installation challenges, especially if the cost advantage is attractive enough, which could be the case here.
As can be seen, AMD silicon is finally starting to shine in the GenAI space. As a result, it is now possible to get better value than the RTX 4080 in Stable Diffusion A1111. The AMD Radeon RX 7900 XTX is said to achieve 18.59 iterations per second at a cost of $52.1 per iteration per second, while the NVIDIA RTX 4080 achieves 19.41 iterations per second at a cost of $56.6 per iteration per second (lower is better). With the less common SHARK implementation, the Radeon RX 7900 XTX's price-to-performance ratio could even be as low as $46.6 per iteration per second. AMD is potentially a serious contender for consumers who have an interest in generative AI.
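The price-to-performance figures above are simply card price divided by throughput. The article does not state the card prices, so the sketch below back-calculates the prices implied by the quoted numbers. The implied prices, the dictionary keys, and the assumption that the SHARK figure applies to the same card price are all assumptions for illustration, not figures from the source.

```python
# Figures quoted in the article: (iterations per second, dollars per it/s).
quoted = {
    "RX 7900 XTX": (18.59, 52.1),
    "RTX 4080": (19.41, 56.6),
}

def implied_price(its_per_sec: float, dollars_per_its: float) -> float:
    """Card price implied by throughput and cost per it/s (an assumption)."""
    return its_per_sec * dollars_per_its

def cost_per_its(price: float, its_per_sec: float) -> float:
    """Dollars paid per iteration per second; lower is better."""
    return price / its_per_sec

# Back-calculated prices (not stated in the article).
prices = {gpu: implied_price(*vals) for gpu, vals in quoted.items()}

# Relative cost advantage of the 7900 XTX over the RTX 4080.
amd_advantage = 1 - quoted["RX 7900 XTX"][1] / quoted["RTX 4080"][1]

# Throughput SHARK would need to reach $46.6 per it/s,
# assuming the same implied card price.
shark_its = prices["RX 7900 XTX"] / 46.6

for gpu, price in prices.items():
    print(f"{gpu}: implied price ${price:.0f}, "
          f"${cost_per_its(price, quoted[gpu][0]):.1f} per it/s")
print(f"7900 XTX cost advantage: {amd_advantage:.1%}")
print(f"Implied SHARK throughput: {shark_its:.2f} it/s")
```

Under these assumptions the 7900 XTX costs about 8% less per unit of throughput, even though the RTX 4080 is slightly faster in absolute terms.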
This could also mean that AMD could grow into a significant competitor to NVIDIA’s AI ambitions if given proper attention. While the majority of people don’t have LLMs in their basements, GenAI as well as SLMs/ULMs could become more ubiquitous and an important part of numerous productivity workflows in the next 12 months. How Intel and AMD position themselves in a market where NVIDIA holds a significant lead could determine how they fare in an AI-dominated world.
Source: AMD