
Shader and geometry processing performance

                            Peak shader   Peak            Peak memory
                            arithmetic    rasterization   bandwidth
                            (GFLOPS)      rate (Mtris/s)  (GB/s)
GeForce GTX 460 768MB       941           1400            88.3
GeForce GTX 460 1GB 810MHz  1089          1620            124.8
GeForce GTX 470 GC          1120          2500            133.9
GeForce GTX 480             1345          2800            177.4
GeForce GTX 570             1405          2928            152.0
GeForce GTX 580             1581          3088            192.0
Radeon HD 5870              2720          850             153.6
Radeon HD 6850              1517          790             128.0
Radeon HD 6870              2016          900             134.4
Radeon HD 5970              4640          1450            256.0

By virtue of its higher clock speeds, the GTX 570 has somewhat better peak shader arithmetic and triangle rasterization rates than the GeForce GTX 480. As always, the vast SIMD arrays in the Radeon GPUs yield some eye-popping numbers for peak arithmetic rates. At the same time, Nvidia's DX11 GPUs tend to have higher geometry throughput, as represented here by rasterization rates (though there's really more to geometry throughput than rasterization alone).
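For readers curious where theoretical peaks like these come from, here is a rough Python sketch of the usual arithmetic. The unit counts, clock speeds, and memory specs below are assumptions drawn from the cards' commonly published specifications, not figures stated on this page, and the function names are purely illustrative.

```python
def peak_gflops(alus, shader_clock_mhz, flops_per_alu_per_clock=2):
    """Peak shader arithmetic, assuming each ALU can issue one
    multiply-add (counted as 2 flops) per shader clock."""
    return alus * flops_per_alu_per_clock * shader_clock_mhz / 1000.0


def peak_mtris(tris_per_clock, core_clock_mhz):
    """Peak rasterization rate in millions of triangles per second."""
    return tris_per_clock * core_clock_mhz


def peak_bandwidth_gbs(bus_width_bits, data_rate_mtps):
    """Peak memory bandwidth: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_mtps / 1000.0


# Assumed specs (not given in the article itself).
cards = {
    "GeForce GTX 570": dict(alus=480, shader_mhz=1464, core_mhz=732,
                            tris_per_clock=4, bus_bits=320, mtps=3800),
    "Radeon HD 5870": dict(alus=1600, shader_mhz=850, core_mhz=850,
                           tris_per_clock=1, bus_bits=256, mtps=4800),
}

for name, c in cards.items():
    gflops = peak_gflops(c["alus"], c["shader_mhz"])
    mtris = peak_mtris(c["tris_per_clock"], c["core_mhz"])
    gbs = peak_bandwidth_gbs(c["bus_bits"], c["mtps"])
    print(f"{name}: {gflops:.0f} GFLOPS, {mtris} Mtris/s, {gbs:.1f} GB/s")
```

Under those assumed specs, the sketch lands on the same figures as the table rows for the GTX 570 and the Radeon HD 5870. It also makes the trade-off plain: the Radeon's much larger ALU count drives its arithmetic peak, while the GeForce's higher assumed triangles-per-clock rate drives its rasterization peak.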

The first tool we can use to measure delivered pixel shader performance is ShaderToyMark, a pixel shader test based on six different effects taken from the nifty ShaderToy utility. The pixel shaders used are fascinating abstract effects created by demoscene participants, all of whom are credited on the ShaderToyMark homepage. Running all six of these pixel shaders simultaneously easily stresses today's fastest GPUs, even at the benchmark's relatively low 960x540 default resolution.

Up next is a compute shader benchmark built into Civilization V. This test measures the GPU's ability to decompress textures used for the graphically detailed leader characters depicted in the game. The decompression routine is based on a DirectX 11 compute shader. The benchmark reports individual results for a long list of leaders; we've averaged those scores to give you the results you see below.
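The averaging step is simple enough to sketch. This assumes a plain arithmetic mean over the per-leader results; the leader labels and scores here are hypothetical placeholders, not numbers from the benchmark.

```python
from statistics import mean


def overall_score(per_leader_scores):
    """Collapse per-leader benchmark results into one figure
    using an unweighted arithmetic mean (an assumption)."""
    if not per_leader_scores:
        raise ValueError("no per-leader scores to average")
    return mean(per_leader_scores.values())


# Hypothetical placeholder data, for illustration only.
example = {"leader_a": 100.0, "leader_b": 110.0, "leader_c": 120.0}
print(overall_score(example))
```

An unweighted mean treats every leader equally, so a single unusually slow or fast leader scene can shift the overall figure.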

Finally, we have the shader tests from 3DMark Vantage.


Clockwise from top left: Parallax occlusion mapping, Perlin noise, GPU cloth, and GPU particles

Overall, the GTX 570 looks quite strong, as expected, trailing only the GeForce GTX 580 among the single-GPU configs. Obviously, the Radeons sweep the first two 3DMark Vantage shader tests, which are focused on pixel shaders and seem to map well to the wide SIMD machines AMD produces. Otherwise, though, Nvidia's largest GPUs tend to outperform today's smaller Radeons.

Geometry processing throughput
The most obvious area of divergence between the current GPU architectures from AMD and Nvidia is geometry processing, which has become a point of emphasis with the advent of DirectX 11's tessellation feature. We can measure geometry processing speeds pretty straightforwardly with a couple of tools. The first is the Unigine Heaven demo. This demo doesn't really make good use of additional polygons to increase image quality at its highest tessellation levels, but it does push enough polys to serve as a decent synthetic benchmark.

We can push into even higher degrees of tessellation using TessMark's multiple detail levels.

GPUs based on Nvidia's DirectX 11-class Fermi architecture tend to hold up well under the most demanding geometry processing loads, and the GTX 570 in particular pretty much aces these tests. Few modern games make use of tessellation to the degree that these synthetic benchmarks do, however.