Who could blame us, really, for being impatient? The GeForce 8800 is a stunning achievement, and we're eager to see whether AMD can match it. You'll have to forgive the most eager among us, the hollow-eyed Radeon fanboys inhabiting the depths of our forums, wandering aimlessly while carrying their near-empty bottles of X1000-series eye candy and stopping periodically to endure an episode of the shakes. We've all heard the stories about AMD's new GPU, code-named R600, and wondered what manner of chip it might be. We've heard whispers of jaw-dropping potential, but, especially as the delays piled up, doubts crept in as well.
Happily, R600 is at last ready to roll. The Radeon HD 2900 XT graphics card should hit the shelves of online stores today, and we have spent the past couple of weeks dissecting it. Has AMD managed to deliver the goods? Keep reading for our in-depth review of the Radeon HD 2900 XT.
Into the R600
The R600 is easily the biggest technology leap from the Radeon folks since the release of the seminal R300 GPU in the Radeon 9700, and it's also the first Radeon since that represents a true break from R300 technology. That's due in part to the fact that R600 is designed to work in concert with Microsoft's DirectX 10 graphics programming interface, which modifies the traditional graphics pipeline to unlock more programmability and flexibility. As a state-of-the-art GPU, the R600 is also a tremendously powerful parallel computing engine. We're going to look at some aspects of R600 in some detail, but let's start with an overview of the entire chip, so we have a basis for the rest of our discussion.
The R600's most fundamental innovation is the introduction of a unified shader architecture that can process the three types of graphics programs (pixel shaders, vertex shaders, and geometry shaders) established by DX10's Shader Model 4.0 using a single type of processing unit. This arrangement allows for dynamic load balancing between these three thread types, making it possible for R600 to bring the majority of its processing power to bear on the most urgent computational need at hand during the rendering of a frame. In theory, a unified shader architecture can be vastly more efficient and effective than a GPU with fixed shader types, as all DX9-class (and prior) desktop GPUs were.
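To make the load-balancing argument a little more concrete, here is a toy model in Python. It is only a sketch, not a description of how R600 actually schedules threads: the unit counts, the workload numbers, and the function names `cycles_fixed` and `cycles_unified` are all invented for illustration, and the model assumes perfect scheduling with no overhead.

```python
from math import ceil

def cycles_fixed(work, units_per_type):
    """Cycles per frame when each shader type has its own dedicated units.

    Each pool can only run its own type of work, so the slowest pool
    determines how long the frame takes.
    """
    return max(ceil(work[t] / units_per_type[t]) for t in work)

def cycles_unified(work, total_units):
    """Cycles per frame when any unit can run any shader type.

    Idealized: all work is spread evenly across every unit.
    """
    return ceil(sum(work.values()) / total_units)

# Hypothetical pixel-heavy frame (arbitrary units of shader work).
work = {"vertex": 160, "geometry": 0, "pixel": 1600}

# Hypothetical fixed-function split of 32 units versus a unified pool of 32.
fixed_split = {"vertex": 8, "geometry": 8, "pixel": 16}

print("fixed:  ", cycles_fixed(work, fixed_split), "cycles")
print("unified:", cycles_unified(work, sum(fixed_split.values())), "cycles")
```

In this made-up frame, the dedicated geometry units sit idle and the vertex units finish early while the pixel units grind away; the unified pool keeps every unit busy. Real hardware won't hit the ideal, but that is the theory behind the design.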
A high-level diagram of the R600 architecture like the one above will no doubt evoke memories of ATI's first unified shader architecture, the Xenos GPU inside the Xbox 360. The basic arrangement of functional units looks very similar, but R600 is in fact a new and different design in key respects like shader architecture and thread dispatch. One might also wish to draw parallels to the unified shader architecture of Nvidia's G80 GPU, but the R600 arranges its execution resources quite differently from G80, as well. In its GeForce 8800 GTX incarnation, the G80 has 128 scalar stream processors running at 1.35GHz. The R600 is more parallel and runs at lower frequencies; AMD counts 320 stream processors running at 742MHz on the Radeon HD 2900 XT. That's not an inaccurate portrayal of the GPU's structure, but there's much more to it than that, as we'll discuss shortly.
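For a rough sense of what those counts and clocks imply on paper, here is a back-of-the-envelope calculation in Python. The big assumption, which is ours and not something stated above, is that each stream processor can issue one multiply-add (counted as two floating-point operations) per clock; the helper name `peak_gflops` is likewise just for illustration. Treat the results as theoretical peaks, not performance predictions.

```python
def peak_gflops(stream_processors, clock_mhz, flops_per_clock=2):
    """Theoretical peak shader arithmetic rate in GFLOPS.

    flops_per_clock=2 assumes one multiply-add per stream processor
    per clock, which is an assumption made for this sketch.
    """
    return stream_processors * clock_mhz * flops_per_clock / 1000.0

# Stream processor counts and clocks as quoted in the text.
r600 = peak_gflops(320, 742)    # Radeon HD 2900 XT
g80 = peak_gflops(128, 1350)    # GeForce 8800 GTX

print(f"R600 peak: {r600:.1f} GFLOPS")
print(f"G80 peak:  {g80:.1f} GFLOPS")
print(f"Ratio (R600 / G80): {r600 / g80:.2f}")
```

Numbers like these say nothing about how easily each chip keeps its units fed, which is exactly why the raw stream processor count is only part of the story.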
First, though, let's have a look at the R600 chip itself, because, well, see for yourself.
Like the G80, it's frickin' huge. With the cooler removed, you can see it from space. AMD estimates the chip at 700 million transistors, and TSMC packs those transistors onto a die using an 80nm fab process. I measured the R600 at roughly 21 mm by 20 mm, which works out to 420 mm².
I'd like to give you a side-by-side comparison with the G80, but that chip is typically covered by a metal cap, making pictures and measurements difficult. (Yes, I probably should sacrifice a card for science, but I haven't done it yet.) Nvidia says the G80 has 680 million transistors, and it's produced on a larger 90nm fab process at TSMC. I've seen die size estimates for G80 that range from roughly 420 to 490 mm², although Nvidia won't confirm exact numbers. Unlike G80, however, R600 doesn't have to rely on a separate chip to provide display logic, so AMD's solution is almost certainly smaller overall.