Nvidia's GeForce 8600 series graphics cards

DX10 goes mid-range

I DON'T ACTUALLY have to write these reviews, you know. If I get enough caffeine into my bloodstream, letters begin to swim across the page, form into words, and a stream of trenchant observations interlaced with bad jokes just kind of assembles itself in front of me. At least, that's what I think happens. After staying up 'til 4:00 AM to push one of these babies out of the door, I have little or no recollection of the process by which they get written. In fact, I was surprised to learn the other day that I reviewed the Radeon X1650 XT last October. Can't say I recall that one.

Anyhow, before I forget, the prompter of today's stimulant-enhanced activities is Nvidia's launch of a new line of DirectX 10-capable mid-range graphics cards, known as the GeForce 8600 series. In fact, Nvidia is setting loose a whole range of products, from the bargain-bin GeForce 8300 GS to the GeForce 8600 GTS, which merits at least six shots of espresso, by my reckoning. I've spent the better part of the past week testing out the GeForce 8600 GTS and its little brother, the GT, against their predecessors and natural competitors from the Radeon camp.

So how well does the unified architecture behind the killer GeForce 8800 translate into a mid-range GPU? The answer's slipped my mind, my hands are shaking, and I need to visit the men's room. But I think I've stashed some charts and graphs in the following pages, so let's see what they have to say.

Inside the G84 GPU
The GeForce 8 line of GPUs started out with the high-end GeForce 8800, which we reviewed back at its introduction. I'm about to lay a whole bunch of graphics mumbo-jumbo on you that's predicated on knowledge of the G80 GPU, so you might want to read that review if you haven't yet.

Of course, since everyone always listens to suggestions like that one, I won't have to tell you that the G80 is an all-new graphics processor with a unified shader architecture that's designed to live up to the requirements of the DirectX 10 graphics programming interface built into Windows Vista. Nor will I have to explain that the G80 uses a collection of stream processors (SPs) to handle both pixel and vertex processing, allocating computational resources as needed to meet the demands of the scene being drawn. You'll not need to be reminded that the G80 SPs are scalar in nature—that they each operate on a single pixel component, not on the usual four components in parallel. And you'll already be well aware that the G80 produces higher-quality pixels generally, from its 32-bit floating-point precision per pixel component to the wretched excess of texture filtering capacity Nvidia has deployed to deliver angle-independent anisotropic filtering by default. The G80's excellent coverage-sampled antialiasing will be old hat, as well, so you won't be impressed to hear that it delivers up to 16X sample quality with minimal slowdowns.

So I might just as well skip ahead and say that the G84 graphics processor that powers the GeForce 8600 series is based on this same basic technology, with the same capabilities, only scaled down and tweaked to better meet the needs of graphics cards much less expensive than $600 behemoths like the GeForce 8800 GTX.

Indeed, the G84 is down to a more manageable 289 million transistors, by Nvidia's estimates, which is well under half the count of the staggering 680-million-transistor G80. G84s are manufactured by TSMC on an 80nm fab process, and by my shaky measurements, the chips are roughly 169 mm². That number gives the G84 an unmistakably upper-middle-class pedigree. For reference, its predecessor, the G73 GPU powering the GeForce 7600 series, is approximately 125 mm², while the AMD RV570 chip inside the Radeon X1950 Pro is about 230 mm².
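For a rough sense of scale, the figures quoted above reduce to a handful of ratios. The snippet below is only back-of-the-envelope arithmetic on the numbers in this section. The variable names are mine, and the die areas are approximate measurements, so treat the density figure as a ballpark rather than a spec.

```python
# Back-of-the-envelope ratios from the figures quoted in the text.
G84_TRANSISTORS_M = 289   # millions, Nvidia's estimate
G80_TRANSISTORS_M = 680   # millions
DIE_AREA_MM2 = {"G84": 169, "G73": 125, "RV570": 230}  # approximate measurements


def ratio(a, b):
    """Plain ratio of two quantities expressed in the same unit."""
    return a / b


transistor_fraction = ratio(G84_TRANSISTORS_M, G80_TRANSISTORS_M)
g84_density = G84_TRANSISTORS_M / DIE_AREA_MM2["G84"]  # millions per mm^2

print(f"G84 / G80 transistors: {transistor_fraction:.1%}")
for chip in ("G73", "RV570"):
    area_ratio = ratio(DIE_AREA_MM2["G84"], DIE_AREA_MM2[chip])
    print(f"G84 die area vs {chip}: {area_ratio:.2f}x")
print(f"G84 density: ~{g84_density:.2f}M transistors per mm^2")
```

The transistor fraction works out to 42.5 percent, which is where "well under half" comes from.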

Block diagram of the G84. Source: NVIDIA.

A single SP cluster. Source: NVIDIA.

This scaled-down G80 derivative packs only two partitions of 16 stream processors, for a total of 32 SPs onboard. That's a precipitous drop from the 128 SPs in the G80, to say the least. Nvidia, though, has made provisions to keep the G84's performance acceptable in its weight class. The texturing capacity of each SP partition has been beefed up relative to the G80's, so each texture processor can handle eight addresses per clock on the G84 instead of the four on the G80. That gives the G84 the ability to handle a total of 16 texture address ops per clock, although the ratio of texture addressing to filtering capacity is altered. (The texture filtering capacity of each SP partition remains the same as the G80's at eight bilinear filtered texels per clock.)
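The per-clock texturing totals follow from multiplying the per-partition rates by the partition count. Here is a minimal sketch of that arithmetic. The `TextureConfig` class and its field names are hypothetical, and the G80's partition count is my assumption, derived by dividing its 128 SPs into the same 16-SP partitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextureConfig:
    name: str
    sp_partitions: int            # number of 16-SP partitions
    addresses_per_partition: int  # texture address ops per clock
    filtered_per_partition: int   # bilinear filtered texels per clock

    @property
    def address_ops_per_clock(self) -> int:
        return self.sp_partitions * self.addresses_per_partition

    @property
    def filtered_texels_per_clock(self) -> int:
        return self.sp_partitions * self.filtered_per_partition


# G84: two partitions, eight addresses and eight filtered texels each, per the text.
G84 = TextureConfig("G84", 2, 8, 8)
# G80: partition count is an assumption (128 SPs / 16 SPs per partition).
G80 = TextureConfig("G80", 128 // 16, 4, 8)

for gpu in (G84, G80):
    print(
        f"{gpu.name}: {gpu.address_ops_per_clock} address ops/clock, "
        f"{gpu.filtered_texels_per_clock} filtered texels/clock"
    )
```

Under those assumptions, the G84 comes out at one address op per filtered texel, where the G80 has one for every two. That is the altered ratio in question.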

Nvidia claims the other performance tweaks in the G84 include improved stencil cull performance and the ever-informative "various low-level architectural tweaks," whatever those may be.

If you're attuned to more traditional graphics processors, the G84's 32 stream processors may sound like a lot. After all, the G73 chip in the GeForce 7600 has 12 shader processors and the higher-end G71 has 24. But keep in mind, as I've mentioned, that the G84's SPs are scalar, so they only operate on a single pixel component at once, while the vector units in most GPUs process four components together. One could justifiably argue that the G84's 32 SPs are the equivalent of eight traditional shader units—not very impressive. The G84 is banking on SP clocks that are about twice the typical frequencies of previous-gen GPUs and a more efficient overall architecture.
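To put numbers on that argument, here is a crude sketch that converts scalar SPs into four-wide vector-unit equivalents and then scales by relative clock speed. The function name and the 2.0 multiplier are my own stand-ins for "about twice," and the model ignores scheduling efficiency entirely, so it's an illustration of the reasoning and not a performance prediction.

```python
COMPONENTS_PER_VECTOR_UNIT = 4  # a traditional shader unit works on four components


def vector_equivalents(scalar_sps, clock_multiplier=1.0):
    """Scalar SPs expressed as four-wide vector units, scaled by relative clock."""
    return scalar_sps / COMPONENTS_PER_VECTOR_UNIT * clock_multiplier


same_clock = vector_equivalents(32)                          # clock-for-clock
double_clock = vector_equivalents(32, clock_multiplier=2.0)  # assumed ~2x SP clock

print(f"G84 at equal clocks: {same_clock:.0f} vector-unit equivalents")
print(f"G84 at ~2x SP clock: {double_clock:.0f} vector-unit equivalents")
```

On this measure, the doubled SP clock is what lifts the G84 from eight equivalents to sixteen, past the 12 shader processors quoted for the G73.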

Beyond the SPs, the G84 has eight raster operators (or ROPs), so it can output a maximum of eight pixels per clock to memory. That doesn't make for impressive pixel fill rate numbers, but it should suffice. Texturing and shading are the more likely constraints these days. The two ROP partitions on the G84 each have a 64-bit path to memory, yielding a combined 128-bit memory interface—a standard-issue config for this class of GPU from Nvidia but only a third the width of the 384-bit memory bus of the G80.
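The bus width and fill rate figures are simple products, sketched below. The function names are mine, and the 500MHz core clock is a made-up placeholder used only to show the calculation; it is not a spec for any card discussed here.

```python
from fractions import Fraction


def memory_bus_width(rop_partitions, bits_per_partition=64):
    """Total memory interface width in bits."""
    return rop_partitions * bits_per_partition


def peak_pixel_fill_mpixels(rops, core_clock_mhz):
    """Theoretical peak fill rate in Mpixels/s: one pixel per ROP per clock."""
    return rops * core_clock_mhz


g84_bus = memory_bus_width(2)               # two ROP partitions, 64 bits each
bus_fraction = Fraction(g84_bus, 384)       # relative to the G80's 384-bit bus

EXAMPLE_CLOCK_MHZ = 500                     # hypothetical clock, for illustration only
example_fill = peak_pixel_fill_mpixels(8, EXAMPLE_CLOCK_MHZ)

print(f"G84 memory interface: {g84_bus} bits ({bus_fraction} of the G80's)")
print(f"Peak fill at {EXAMPLE_CLOCK_MHZ} MHz: {example_fill} Mpixels/s")
```

Swap in a real core clock to get the theoretical peak for a given card.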

Another improvement over the G80 is the G84's new "VP2" video processing unit, which includes hardware to accelerate more portions of the HD video decoding task. Nvidia says the G84's VP2 has "full acceleration for the entire H.264 decode chain," although such pronouncements are notoriously slippery. H.264 decoding involves many stages, and some chores will almost certainly fall to the CPU. Nonetheless, the G84 has new logic to assist with decoding H.264's context-adaptive encoding schemes and with decryption of 128-bit AES copy-protected content. These abilities will certainly be a welcome addition to this mid-range GPU, which is likely to find its way into systems that lack the CPU horsepower to handle high-def video processing on their own.

Unlike the G80, the G84 doesn't require a separate, external display chip; the G84 has its display output logic built in, and it's capable of driving a pair of dual-link DVI connections simultaneously, each at a maximum 2560x1600 resolution. HDCP support is also included, since so many of us fear for the safety of the content we've purchased.