Well. The run-up to the release of this here graphics card has certainly been unusual. AMD revealed a bunch of details about the Radeon R9 290X and the new "Hawaii" chip on which it's based at a press event one month ago. As a result, most folks have had a pretty good idea what to expect for a while. Today, the 290X should be up for sale at online retailers, and it's finally time for us to review this puppy. Let's have a look, shall we?
Say aloha to Hawaii
Not only is the Radeon R9 290X a beefy graphics card intended to compete with the likes of the GeForce GTX 780, but it's also something else: the platform for a truly new chip, with updated technology inside. Most of the rest of the cards in the Radeon R7 and R9 series introduced recently are renamed and slightly tweaked cards based on existing silicon. Not so here. The Hawaii GPU that powers the 290X represents the next generation of GPU technology from AMD, with a number of incremental improvements over the last gen.
The main thing Hawaii is, though, is bigger: larger than the Tahiti chip in the Radeon HD 7970 (and R9 280X), with more of everything that matters. As you can see in the table above, Hawaii has very high counts of every key graphics resource. In fact, Hawaii matches up well on paper against the GK110 chip that drives the GeForce GTX 780 and Titan, even though it's over 100 mm² smaller in terms of die area—and both of those GPUs are manufactured on the same 28-nm fab process at TSMC.
At its core, Hawaii is based on familiar tech: the Graphics Core Next architecture first introduced in the Radeon HD 7000 series. However, this is the next iteration of GCN, with some minor tweaks to the compute units and larger changes elsewhere. Also, AMD has overhauled the layout in this GPU in order to ensure the right performance balance at its larger scale.
The chip's graphics processing resources are broken down into four separate "shader engines," each one almost an independent GPU unto itself. Graphics tasks are load balanced between the four engines. Each shader engine has its own geometry processor and rasterizer, effectively doubling the primitive rasterization rate versus Tahiti. That upgrade should improve performance when there are more polygons onscreen, particularly with higher levels of tessellation. In addition, the geometry units have been tweaked to improve data flow, which makes sense. By all accounts, the geometry amplification that happens during tessellation remains a hard problem for GPUs to handle.
If you've been following these things for even a little while, looking at these shader engines will make you feel old. Each one of them has four render back ends that together can blend and output 16 pixels per clock cycle. That was pretty much a whole GPU's worth of pixel fill and antialiasing power back in the day. And by "the day," I mean two weeks ago, when we reviewed the Radeon R7 260X. Each shader engine also has 11 of the GCN compute units that give the GPU its number-crunching power. Every CU has four 16-wide vector math units. That works out to 704 "shader processors" per engine, again almost enough to match the scale of a mid-range GPU.
Now, multiply everything in the above paragraph by four, and you've got Hawaii, with a total of 2816 shader processors and 64 pixels per clock of ROP power. At an even 1GHz, Hawaii is capable of 5.6 teraflops of single-precision compute, making it easily the new leader in consumer graphics chips. For compute-focused applications, it can handle double-precision floating-point math at one quarter that rate, still well in excess of a teraflop.
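For anyone who wants to check the math, here is a quick Python sketch of how those totals fall out of the per-engine figures. The unit counts and the 1GHz clock come from the text; the factor of two FLOPs per shader processor per clock is my assumption (one fused multiply-add per cycle), and the function names are just illustrative.

```python
# Per-engine figures as quoted in the text.
SHADER_ENGINES = 4
CUS_PER_ENGINE = 11
VECTOR_UNITS_PER_CU = 4
LANES_PER_VECTOR_UNIT = 16
PIXELS_PER_CLOCK_PER_ENGINE = 16

# Assumption: each shader processor retires one fused multiply-add
# (counted as 2 FLOPs) per clock cycle.
FLOPS_PER_LANE_PER_CLOCK = 2


def shader_processors_per_engine():
    """Shader processors in a single shader engine."""
    return CUS_PER_ENGINE * VECTOR_UNITS_PER_CU * LANES_PER_VECTOR_UNIT


def total_shader_processors():
    """Shader processors across all four engines."""
    return SHADER_ENGINES * shader_processors_per_engine()


def total_pixels_per_clock():
    """ROP throughput across all four engines, in pixels per clock."""
    return SHADER_ENGINES * PIXELS_PER_CLOCK_PER_ENGINE


def peak_tflops(shader_processors, clock_ghz):
    """Peak single-precision rate in teraflops at the given clock."""
    gflops = shader_processors * FLOPS_PER_LANE_PER_CLOCK * clock_ghz
    return gflops / 1000.0


sp = total_shader_processors()
single = peak_tflops(sp, 1.0)
double = single / 4  # quarter rate, as stated in the text

print(f"Shader processors per engine: {shader_processors_per_engine()}")
print(f"Shader processors total:      {sp}")
print(f"Pixels per clock:             {total_pixels_per_clock()}")
print(f"Single precision:             {single:.2f} TFLOPS")
print(f"Double precision:             {double:.2f} TFLOPS")
```

Plug in a different clock speed for `clock_ghz` and the peak rate scales linearly, which is all a paper spec like this really tells you.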
All of this computing power is backed by a 1MB L2 cache. This cache is fully read/write capable and is divided into 16 partitions of 64KB each. The L2's capacity is a third larger than Tahiti's 768KB L2, and AMD says bandwidth is up by a third, as well. The firm claims Hawaii's L1 and L2 caches can exchange data as fast as one terabyte per second, which is more than I can say for my USB 3.0 drive dock.
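The cache numbers are easy to sanity-check, too. This little sketch uses only the partition count, partition size, and Tahiti capacity quoted above; the names are my own.

```python
# L2 cache figures as quoted in the text.
L2_PARTITIONS = 16
PARTITION_KB = 64
TAHITI_L2_KB = 768


def hawaii_l2_kb():
    """Total L2 capacity in KB: partitions times size per partition."""
    return L2_PARTITIONS * PARTITION_KB


def growth_over_tahiti():
    """Fractional capacity increase relative to Tahiti's L2."""
    return hawaii_l2_kb() / TAHITI_L2_KB - 1


print(f"Hawaii L2: {hawaii_l2_kb()} KB")
print(f"Increase over Tahiti: {growth_over_tahiti():.1%}")
```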
Oddly enough, the most intriguing thing about Hawaii's basic architecture may be a fairly straightforward engineering tradeoff. The chip has eight 64-bit memory interfaces onboard, giving it, effectively, a 512-bit-wide path to memory. In order to make that wide memory path practical while keeping the chip size in check, the Hawaii team chose to exchange the complex memory PHYs in Tahiti for smaller, simpler ones. Complex PHYs, or physical interface devices, are necessary to drive GDDR5 DRAMs at peak clock frequencies, but they also eat up silicon space. AMD claims Hawaii's 512-bit memory interface occupies 20% less die area than Tahiti's 384-bit interface. As a result, Hawaii's memory operates at lower speeds. The 290X's GDDR5 runs at 5 GT/s, down from 6 GT/s for Tahiti-based cards like the Radeon HD 7970 GHz Edition. Still, overall memory bandwidth is up from 288 GB/s on Tahiti to 320 GB/s with Hawaii, thanks to the wider data path.
Of course, Hawaii's advantage on this front extends beyond Tahiti. Nvidia chose a 384-bit interface and 6 GT/s memory rates for its competing GK110 chip, too, which works out to the same 288 GB/s.
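The wider-but-slower tradeoff is simple enough to sketch in a few lines of Python. Peak bandwidth is just the bus width in bytes times the transfer rate; the widths and rates below are the ones quoted in the text, while the function and dictionary names are mine.

```python
def bandwidth_gb_per_s(bus_width_bits, transfer_rate_gtps):
    """Peak memory bandwidth in GB/s.

    bus_width_bits: total width of the memory interface, in bits.
    transfer_rate_gtps: per-pin data rate, in gigatransfers per second.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_gtps


# Hawaii's 512-bit path is built from eight 64-bit interfaces.
HAWAII_BUS_BITS = 8 * 64

# (bus width in bits, transfer rate in GT/s), as quoted in the text.
configs = {
    "Hawaii (R9 290X)": (HAWAII_BUS_BITS, 5),
    "Tahiti (HD 7970 GHz Edition)": (384, 6),
    "GK110": (384, 6),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {bandwidth_gb_per_s(width, rate):.0f} GB/s")
```

Hawaii gives up a sixth of its per-pin speed and gains a third in width, so it still comes out ahead.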