Sizing 'em up
Do the math involving the clock speeds and per-clock potency of the latest high-end graphics cards, and you'll end up with a comparative table that looks something like this:

| | Peak pixel fill rate (Gpixels/s) | Peak bilinear filtering int8/fp16 (Gtexels/s) | Peak rasterization rate (Gtris/s) | Peak shader arithmetic rate (tflops) | Memory bandwidth (GB/s) |
|---|---|---|---|---|---|
| Asus R9 290X | 67 | 185/92 | 4.2 | 5.9 | 346 |
| Radeon R9 Fury X | 67 | 269/134 | 4.2 | 8.6 | 512 |
| GeForce GTX 780 Ti | 37 | 223/223 | 4.6 | 5.3 | 336 |
| Gigabyte GTX 980 | 85 | 170/170 | 5.3 | 5.4 | 224 |
| GeForce GTX 980 Ti | 95 | 189/189 | 6.5 | 6.1 | 336 |
| GeForce Titan X | 103 | 206/206 | 6.5 | 6.6 | 336 |
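
For the curious, here is a rough sketch of the math behind the Fury X's row: per-clock unit counts multiplied by clock speed. Only the 64 pixels per clock of ROP throughput is stated in this article. The 1050MHz clock, 256 texels per clock, half-rate fp16 filtering, four triangles per clock, 4096 shader ALUs, and 4096-bit memory interface at 1 Gbps per pin are assumptions drawn from AMD's published specs, and the function name is purely illustrative.

```python
# Back-of-the-envelope peak rates for the Radeon R9 Fury X.
# Assumed inputs; only the ROP rate is confirmed in this article.
CLOCK_GHZ = 1.05          # assumed 1050MHz peak clock
ROPS_PER_CLOCK = 64       # pixels per clock, same as Hawaii
TEXELS_PER_CLOCK = 256    # assumed int8 bilinear texels per clock
TRIS_PER_CLOCK = 4        # assumed triangles rasterized per clock
SHADER_ALUS = 4096        # assumed ALU count, one FMA (2 flops) per clock each
BUS_WIDTH_BITS = 4096     # assumed HBM interface width
GBPS_PER_PIN = 1.0        # assumed per-pin data rate


def peak_rates():
    """Return theoretical peaks keyed by (hypothetical) column names."""
    return {
        "pixel_fill_gpixels": ROPS_PER_CLOCK * CLOCK_GHZ,
        "filter_int8_gtexels": TEXELS_PER_CLOCK * CLOCK_GHZ,
        # fp16 filtering is assumed to run at half the int8 rate
        "filter_fp16_gtexels": TEXELS_PER_CLOCK * CLOCK_GHZ / 2,
        "raster_gtris": TRIS_PER_CLOCK * CLOCK_GHZ,
        "shader_tflops": SHADER_ALUS * 2 * CLOCK_GHZ / 1000,
        "bandwidth_gbs": BUS_WIDTH_BITS * GBPS_PER_PIN / 8,
    }


if __name__ == "__main__":
    for name, value in peak_rates().items():
        print(f"{name}: {value:.1f}")
```

Rounded to the table's precision, those products line up with the Fury X figures above.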

Those are the peak capabilities of each of these cards in theory. As I noted in my article on the Fiji GPU architecture, the Fury X is particularly strong in several departments, including memory bandwidth and shader rates, where it substantially outstrips both the R9 290X and the competing GeForce GTX 980 Ti. In other areas, the Fury X's theoretical graphics rates haven't budged compared to the 290X, including the pixel fill rate and rasterization. Those are also precisely the areas where the Fury X looks weakest compared to the competition. We are looking at a bit of asymmetrical warfare this time around, with AMD and Nvidia fielding vastly different mixes of GPU resources in similarly priced products.
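
To put rough numbers on that asymmetry, the sketch below divides the Fury X's theoretical peaks by those of the 290X and the GTX 980 Ti. The figures are copied straight from the table above; the dictionary keys and the `ratio` helper are hypothetical names chosen for illustration.

```python
# Theoretical peaks copied from the table above.
# pixel_fill in Gpixels/s, raster in Gtris/s, shader in tflops, bandwidth in GB/s.
PEAKS = {
    "R9 290X":    {"pixel_fill": 67, "raster": 4.2, "shader": 5.9, "bandwidth": 346},
    "Fury X":     {"pixel_fill": 67, "raster": 4.2, "shader": 8.6, "bandwidth": 512},
    "GTX 980 Ti": {"pixel_fill": 95, "raster": 6.5, "shader": 6.1, "bandwidth": 336},
}


def ratio(card, baseline, metric):
    """How many times the baseline's peak the given card reaches."""
    return PEAKS[card][metric] / PEAKS[baseline][metric]


if __name__ == "__main__":
    for baseline in ("R9 290X", "GTX 980 Ti"):
        for metric in ("pixel_fill", "raster", "shader", "bandwidth"):
            print(f"Fury X vs {baseline}, {metric}: "
                  f"{ratio('Fury X', baseline, metric):.2f}x")
```

On paper, then, the Fury X has roughly 1.4 times the shader rate and 1.5 times the memory bandwidth of the GTX 980 Ti, but only about 0.7 times its pixel fill rate and 0.65 times its rasterization rate.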

Of course, those are just theoretical peak rates. Our fancy Beyond3D GPU architecture suite measures true delivered performance using a series of directed tests.

The Fiji GPU has the same 64 pixels per clock of ROP throughput as Hawaii before it, so these results shouldn't come as a surprise. These numbers illustrate something noteworthy, though. Nvidia has grown the ROP counts substantially in its Maxwell-based GPUs, taking even the mid-range GM204 aboard the GTX 980 beyond what Hawaii and Fiji offer. Truth be told, both of the Radeons probably offer more than enough raw pixel fill rate. However, these results are a sort of proxy for other types of ROP power, like blending for multisampled anti-aliasing and Z/stencil work for shadowing, that can tax a GPU.

This bandwidth test measures GPU throughput using two different textures: an all-black surface that's easily compressed and a random-colored texture that's essentially incompressible. The Fury X's results demonstrate several things of note.

The 16% delta between the black and random textures shows us that Fiji's delta-based color compression does it some good, although evidently not as much good as the color compression does on the Maxwell-based GeForces.

Also, our understanding from past reviews was that the R9 290X was limited by ROP throughput in this test. Somehow, the Fury X speeds past the 290X despite having the same ROP count on paper. Hmm. Perhaps we were wrong about what limited the 290X. If so, then the 290X may have been bandwidth limited after all, and Hawaii apparently has no texture compression of note. The question then becomes whether the Fury X is also bandwidth limited in this test or whether its performance is limited by the render back-end. Whatever the case, the Fury X "only" achieves 387 GB/s of throughput here, well below the 512 GB/s theoretical max of its HBM-infused memory subsystem. Ominously, the Fury X leads the GTX 980 Ti by only the slimmest of margins with the compressible black texture.
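
As a quick sanity check on that gap, this sketch computes what fraction of the theoretical peak the measured figure represents. Both numbers come from the paragraph above; the `efficiency` helper is a hypothetical name, and the result says nothing about which part of the chip is the limiter.

```python
MEASURED_GBS = 387   # Fury X throughput quoted above
PEAK_GBS = 512       # theoretical max of the HBM subsystem


def efficiency(measured, peak):
    """Fraction of theoretical peak bandwidth actually delivered."""
    return measured / peak


if __name__ == "__main__":
    print(f"Delivered: {efficiency(MEASURED_GBS, PEAK_GBS):.1%} of peak")
    print(f"Shortfall: {PEAK_GBS - MEASURED_GBS} GB/s")
```

That works out to roughly three-quarters of the theoretical peak, with about 125 GB/s left on the table.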

Fiji has a ton of texture filtering capacity on tap, especially for simpler formats. The Fury X falls behind the GTX 980 Ti when filtering texture formats that are 16 bits per color channel, though. That fact will matter more or less depending on the texture formats used by the game being run.

The Fury X achieves something close to its maximum theoretical rate in our polygon throughput test, at least when the polygons are presented in a list format. However, it still trails even the Kepler-based GeForce GTX 780 Ti, let alone the newer GeForces. Adding tessellation to the mix doesn't help matters. The Fury X still manages just over half the throughput of the GTX 980 Ti in TessMark.

Fiji's massive shader array is not to be denied. The Fury X crunches through its full 8.6 teraflops of theoretical peak performance in our ALU throughput test.

At the end of the day, the results from these directed tests largely confirm the major contrasts between the Fury X and the GeForce GTX 980 Ti. These two solutions have sharply divergent mixes of resources on tap, not just on paper but in terms of measurable throughput.