Putting those new shaders to work
|GPU||Peak shader arithmetic (TFLOPS)||Memory bandwidth (GB/s)|
|GeForce GTX 280||0.6||142|
|GeForce GTX 480||1.3||177|
|GeForce GTX 580||1.6||192|
|Radeon HD 5870||2.7||154|
|Radeon HD 6970||2.7||176|
|Radeon HD 7970||3.8||264|
The first couple of tests above, the cloth and particles simulations, primarily use vertex and geometry shaders to do their work. In those tests, the 7970 easily outperforms the 6970, but it's not quite as fast as the two Fermi-based GeForces. As we've noted, vertex processing remains a strength of Nvidia's architecture.
Boy, things turn around in a hurry once we move into the last three tests, which rely on pixel shader throughput. True to form, AMD's older GPUs tend to outrun the GeForces in these tests, since they're quite efficient with pixel-centric workloads. Even so, Tahiti is substantially faster. In a couple of cases, the 7970 delivers on its potential to crank out over twice the FLOPS of the GeForce GTX 580.
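For a sense of what "over twice the FLOPS" looks like on paper, here is a minimal sketch that divides the peak shader figures from the table at the top of this page. It assumes the first numeric column is peak shader arithmetic in TFLOPS; the `peak_tflops` dictionary and the `flops_ratio` helper are names made up for this illustration, and the numbers are theoretical peaks, not measured results.

```python
# Theoretical peak shader arithmetic, copied from the table on this page.
# Assumption: the values are in TFLOPS.
peak_tflops = {
    "GeForce GTX 280": 0.6,
    "GeForce GTX 480": 1.3,
    "GeForce GTX 580": 1.6,
    "Radeon HD 5870": 2.7,
    "Radeon HD 6970": 2.7,
    "Radeon HD 7970": 3.8,
}


def flops_ratio(card: str, baseline: str) -> float:
    """Return how many times the baseline's peak FLOPS a card offers."""
    return peak_tflops[card] / peak_tflops[baseline]


if __name__ == "__main__":
    baseline = "GeForce GTX 580"
    for card in peak_tflops:
        # Print each card's theoretical peak relative to the GTX 580.
        print(f"{card}: {flops_ratio(card, baseline):.2f}x the {baseline}")
```

On these paper numbers, the 7970 comes out at a bit under 2.4 times the GTX 580's peak, so a test that shows a better-than-2x gap is getting close to what the theoretical rates allow.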
GPU computing performance
These results are instructive. When we move from pixel shaders into DirectCompute performance, the Fermi-based GeForces recapture the lead from the Cypress- and Cayman-based Radeons. The Radeons have much higher theoretical FLOPS peaks, but the GeForces tend to be more efficient here. Tahiti, though, changes the dynamic. The Radeon HD 7970 outruns the GTX 580 and is nearly 50% faster than the Cypress-based Radeon HD 5870.
LuxMark is a ray-traced rendering test that uses OpenCL to harness any compatible processor to do its work. As you can see, we've even included the Core i7-980X CPU in our test system as a point of comparison. Obviously, though, the 7970 is the star of this show. The newest Radeon nearly doubles the throughput of its elder siblings—and nearly triples the performance of the Fermi-based GeForces. We've only run a couple of GPU computing tests, so our results aren't the last word on the matter, but Tahiti may be the best GPU computing engine out there. AMD appears to have combined two very desirable traits in this chip's shader array: much higher utilization (and thus efficiency) than previous DX11-class Radeons, and gobs of FLOPS in the given chip area.