
Nvidia's GeForce 8800 GT graphics processor

Mid-range marvel

This is an absolutely spectacular time to be a PC gamer. The slew of top-notch and hotly anticipated games hitting store shelves is practically unprecedented, including BioShock, Crysis, Quake Wars, Unreal Tournament 3, and Valve's Orange Box trio of goodness. I can't remember a time quite like it.

However, this may not be the best time to own a dated graphics card. The latest generation of high-end graphics cards brought with it pretty much twice the performance of previous high-end cards, and to add insult to injury, these GPUs added DirectX 10-class features that today's games are starting to exploit. If you have last year's best, such as a GeForce 7900 or Radeon X1900, you may not be able to drink in all the eye candy of the latest games at reasonable frame rates.

And if you've played the Crysis demo, you're probably really ready to upgrade. I've never seen a prettier low-res slide show.

Fortunately, DirectX 10-class graphics power is getting a whole lot cheaper, starting today. Nvidia has cooked up a new spin of its GeForce 8 GPU architecture, and the first graphics card based on this chip sets a new standard for price and performance. Could the GeForce 8800 GT be the solution to your video card, er, Crysis? Let's have a look.

Meet the G92
In recent years, graphics processor transistor budgets have been ballooning at a rate even faster than Moore's Law, and that has led to some, um, exquisitely plus-sized chips. This fall's new crop of GPUs looks to be something of a corrective to that trend, and the G92 is a case in point. This chip is essentially a die shrink of the G80 graphics processor that powers incumbent GeForce 8800 graphics cards. The G92 adds some nice new capabilities, but doesn't double up on shader power or anything quite that earth-shaking.

Here's an extreme close-up of the G92, which may convince your boss/wife that you're reading something educational and technically edifying right about now. We've pictured it next to a U.S. quarter in order to further propagate the American hegemonic mindset. Er, I mean, to provide some context, size-wise. The G92 measures almost exactly 18 mm by 18 mm, or 324 mm². TSMC manufactures the chip for Nvidia on a 65nm fab process, which somewhat miraculously manages to shoehorn roughly 754 million transistors into this space. By way of comparison, the much larger G80—made on a 90nm process—had only 681 million transistors. AMD's R600 GPU packs 700 million transistors into a 420 mm² die area.
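For a rough sense of how much the 65nm process buys, here's a quick back-of-the-envelope sketch that turns the numbers above into transistor density. It uses only the transistor counts and die sizes quoted in this article; the names `CHIPS` and `density` are made up for illustration, and the G80 is left out because its die area isn't given here.

```python
# Figures quoted in the text: (transistors in millions, die area in mm^2).
# Names are illustrative only, not anything from Nvidia or AMD.
CHIPS = {
    "G92": (754, 18 * 18),  # 18 mm x 18 mm = 324 mm^2
    "R600": (700, 420),
}


def density(transistors_m, area_mm2):
    """Return millions of transistors per square millimeter."""
    return transistors_m / area_mm2


for name, (transistors_m, area_mm2) in CHIPS.items():
    print(f"{name}: {density(transistors_m, area_mm2):.2f} M transistors/mm^2")
# G92: 2.33 M transistors/mm^2
# R600: 1.67 M transistors/mm^2
```

These are approximate figures, since both the transistor counts and the die measurements are rounded.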

Why, you may be asking, does the G92 have so many more transistors than the G80? Good question. The answer is: a great many little additions here and there, including some we may not know about just yet.

One big change is the integration of the external display chip that acted as a helper to the G80. The G92 natively supports twin dual-link DVI outputs with HDCP, without the need for a separate display chip. That ought to make G92-based video cards cheaper and easier to make. Another change is the inclusion of the VP2 processing engine for high-definition video decoding and playback, an innovation first introduced in the G84 GPU behind the GeForce 8600 lineup. The VP2 engine can handle the most intensive portions of H.264 video decoding in hardware, offloading that burden from the CPU.

Both of those capabilities are pulled in from other chips, but here's a novel one: PCI Express 2.0 support. PCIe 2.0 effectively doubles the bandwidth available for communication between the graphics card and the rest of the system, and the G92 is Nvidia's first chip to support this standard. This may be the least-hyped graphics interface upgrade in years, in part because PCIe 1.1 offers quite a bit of bandwidth already. Still, PCIe 2.0 is a major evolutionary step, though I doubt it chews up too many additional transistors.
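To put a number on that doubling, here's a minimal sketch of the peak bandwidth math. It assumes the standard per-lane signaling rates (2.5 GT/s for PCIe 1.x, 5 GT/s for PCIe 2.0), 8b/10b line coding, and an x16 graphics slot; none of those figures come from Nvidia, and the function name is hypothetical. Real-world throughput will be lower once protocol overhead enters the picture.

```python
# Per-lane signaling rates in megatransfers per second (one bit per transfer).
SIGNALING_MT_S = {"PCIe 1.1": 2500, "PCIe 2.0": 5000}


def peak_mb_per_s(mt_per_s, lanes=16):
    """Peak bandwidth in MB/s, one direction, for a link with `lanes` lanes."""
    data_mbit_per_lane = mt_per_s * 8 // 10  # 8b/10b coding: 8 data bits per 10 sent
    mbyte_per_lane = data_mbit_per_lane // 8  # bits to bytes
    return mbyte_per_lane * lanes


for gen, rate in SIGNALING_MT_S.items():
    print(f"{gen} x16: {peak_mb_per_s(rate)} MB/s per direction")
# PCIe 1.1 x16: 4000 MB/s per direction
# PCIe 2.0 x16: 8000 MB/s per direction
```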

So where else do the G92's additional transistors come from? This is where things start to get a little hazy. You see, the GeForce 8800 GT doesn't look to be a "full" implementation of G92. Although this chip has the same basic GeForce 8-series architecture as its predecessors, the GeForce 8800 GT officially has 112 stream processors, or SPs. That's seven "clusters" of 16 SPs each. Chip designers don't tend to do things in odd numbers, so I'd wager an awful lot of Nvidia stock that the G92 actually has at least eight SP clusters onboard.

Eight's probably the limit, though, because the G92's SP clusters are "fatter" than the G80's; they incorporate the G84's more robust texture addressing capacity of eight addresses per clock, up from four in the G80. That means the GeForce 8800 GT, with its seven SP clusters, can sample a total of 56 texels per clock—well beyond the 24 of the 8800 GTS and 32 of the 8800 GTX. We'll look at the implications of this change in more detail in a sec.
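The cluster math is simple enough to sketch out. The snippet below multiplies SP clusters by per-cluster resources to reproduce the figures above. One assumption to flag: the cluster counts for the 8800 GTS (six) and 8800 GTX (eight) aren't stated in this article, so they're inferred here from the quoted texel rates at four addresses per cluster. The names are illustrative only.

```python
SPS_PER_CLUSTER = 16  # from the text: 16 stream processors per cluster

# (SP clusters, texture addresses per cluster per clock)
# GTS and GTX cluster counts are assumptions inferred from 24 / 4 and 32 / 4.
CARDS = {
    "GeForce 8800 GTS": (6, 4),
    "GeForce 8800 GTX": (8, 4),
    "GeForce 8800 GT": (7, 8),
}


def stream_processors(clusters):
    """Total stream processors across all active clusters."""
    return clusters * SPS_PER_CLUSTER


def texels_per_clock(clusters, addresses_per_cluster):
    """Total texture addresses the chip can handle per clock."""
    return clusters * addresses_per_cluster


for name, (clusters, addresses) in CARDS.items():
    print(f"{name}: {stream_processors(clusters)} SPs, "
          f"{texels_per_clock(clusters, addresses)} texels/clock")
# GeForce 8800 GTS: 96 SPs, 24 texels/clock
# GeForce 8800 GTX: 128 SPs, 32 texels/clock
# GeForce 8800 GT: 112 SPs, 56 texels/clock
```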

Another area where the GeForce 8800 GT may be sporting a bit of trimmed down G92 functionality is in the ROP partitions. These sexy little units are responsible for turning fully processed and shaded fragments into full-blown pixels. They also provide much of the chip's antialiasing grunt, and in Nvidia's GeForce 8 architecture, each ROP partition has a 64-bit interface to video memory. The G80 packs six ROP partitions, which is why the full-blown GeForce 8800 GTX has a 384-bit path to memory and the sawed-off 8800 GTS (with five ROP partitions) has a 320-bit memory interface. We don't know how many ROP partitions the G92 has lurking inside, but the 8800 GT uses only four of them. As a result, it has a 256-bit memory interface, can output a maximum of 16 finished pixels per clock, and has somewhat less antialiasing grunt on a clock-for-clock basis.
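Here's the ROP partition arithmetic as a quick sketch. The 64 bits per partition comes straight from the text; the four pixels per partition per clock is an assumption, worked backward from the 8800 GT's 16 pixels per clock across four partitions. Function and variable names are hypothetical.

```python
BITS_PER_PARTITION = 64   # from the text: 64-bit memory interface per partition
PIXELS_PER_PARTITION = 4  # assumption: 16 pixels/clock divided by 4 partitions

ROP_PARTITIONS = {
    "GeForce 8800 GTX": 6,
    "GeForce 8800 GTS": 5,
    "GeForce 8800 GT": 4,
}


def memory_bus_bits(partitions):
    """Width of the memory interface in bits."""
    return partitions * BITS_PER_PARTITION


def pixels_per_clock(partitions):
    """Maximum finished pixels written per clock."""
    return partitions * PIXELS_PER_PARTITION


for name, partitions in ROP_PARTITIONS.items():
    print(f"{name}: {memory_bus_bits(partitions)}-bit bus, "
          f"{pixels_per_clock(partitions)} pixels/clock")
# GeForce 8800 GTX: 384-bit bus, 24 pixels/clock
# GeForce 8800 GTS: 320-bit bus, 20 pixels/clock
# GeForce 8800 GT: 256-bit bus, 16 pixels/clock
```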

How many ROPs does G92 really have? I dunno. I suspect we'll find out before too long, though.