
Nvidia's GeForce 9600 GT graphics processor


The mighty mite aims for the Radeon HD 3800 series
— 12:09 AM on February 26, 2008

For a while there, trying to find a decent DirectX 10-capable graphics card for somewhere around two hundred bucks was a tough assignment. Nvidia had its GeForce 8600 GTS, but that card didn't really perform well enough to measure up to similarly priced DX9 cards. On the Radeon side of things, AMD had, well, pretty much nothing. You could buy a cheap, slow DX10-ready Radeon or a faster one with a formidable price tag. Between them, crickets chirped as tumbleweeds blew by.

Happily, the GPU makers saw fit to remedy this situation, and in the past few months, we've gained an embarrassment of riches in video card choices between about $170 and $250, including the screaming GeForce 8800 GT and a pair of solid values in the Radeon HD 3850 and 3870. Now, that embarrassment is becoming positively scandalous, as Nvidia unveils yet another new GPU aimed at graphics cards below the $200 mark: the GeForce 9600 GT.

Where does the 9600 GT fit into the daunting mix of video cards available in this price range? How does it match up with the Radeon HD 3850 and 3870? Why is this new GPU the first installment in the GeForce 9 series? We have no idea about that last one, but we'll try to answer those other questions.


The GeForce 9600 GT laid bare

Welcome the new middle management
Let's get this out of the way at the outset. Nvidia's decision to make this new graphics card the first in the GeForce 9 series is all kinds of baffling. They just spent the past few months introducing two new members of the 8-series, the GeForce 8800 GT and the confusingly named GeForce 8800 GTS 512, based on a brand-new chip codenamed G92. The G92 packs a number of enhancements over older GeForce 8 graphics processors, including some 3D performance tweaks and improved HD video features. Now we have another new GPU, codenamed G94, that's based on the exact same generation of technology and is fundamentally similar to the G92 in almost every way. The main difference between the two chips is that Nvidia has given the G94 half as many stream processor (SP) units as the G92 in order to create a smaller, cheaper chip. Beyond that, they're pretty much the same thing.

So why the new name? Nvidia contends it's because the first product based on the G94, the GeForce 9600 GT, represents such a big performance leap over the prior-generation GeForce 8600 GTS. I suppose that may be true, but they're probably going to have to rename the GeForce 8800 GT and GTS 512 in order to make their product lineup rational again. For now, you'll just want to keep in mind that when you're thinking about the GeForce 8800 GT and the 9600 GT, you're talking about products based on two chips from the same generation of technology, the G92 and G94. They share the same feature set, so choosing between them ought to be a simple matter of comparing price and performance, regardless of what the blue-shirts haunting the aisles of Best Buy tell you.

Not that we really care about that stuff, mind you. We're much more interested in the price and performance end of things, and here, the G94 GPU looks mightily promising. Because Nvidia has excised only the SP clusters from the G92, the G94 retains most of the bits and pieces it needs to perform quite well, including a 256-bit memory interface and a full complement of 16 ROP units to output pixels and handle antialiasing blends. Yes, the G94 is down a little bit in terms of shader processing power and (since texture units are located in the SPs) texture filtering throughput. But you may recall that the GeForce 8800 GT is based on a G92 with one of its eight SP clusters disabled, and it works quite well indeed.

Here's a quick look at the G94's basic capabilities compared to some common points of reference.

                   ROP output   Texture filtering   Texture filtering   Stream       Memory
                   (pixels/     (bilinear           (bilinear FP16      processors   interface
                   clock)       texels/clock)       texels/clock)                    (bits)
Radeon HD 38x0         16           16                  16                 320          256
GeForce 9600 GT        16           32                  16                  64          256
GeForce 8800 GT        16           56                  28                 112          256
GeForce 8800 GTS       20           24                  24                  96          320

The 9600 GT is suitably potent to match up well in most categories with the GeForce 8800 GT and the Radeon HD 3850/3870. Even the older G80-based GeForce 8800 GTS fits into the conversation, although its capacities are almost all higher. As you know, the RV670 GPU in the Radeons has quite a few more stream processors, but Nvidia's GPUs tend to make up that difference with higher SP clock speeds.

In fact, the GeForce 9600 GT makes up quite a bit of ground thanks to its clock speeds. The 9600 GT's official "base" clock speeds are 650MHz for the GPU core, 1625MHz for the stream processors, and 900MHz (1800MHz effective) for its GDDR3 memory. From there, figuring out the GPU's theoretical potency is easy.

                   Peak pixel    Peak bilinear     Peak bilinear     Peak shader   Peak memory
                   fill rate     texel filtering   FP16 filtering    arithmetic    bandwidth
                   (Gpixels/s)   (Gtexels/s)       (Gtexels/s)       (GFLOPS)      (GB/s)
Radeon HD 3870        12.4           12.4              12.4             496           72.0
GeForce 9600 GT       10.4           20.8              10.4             312           57.6
GeForce 8800 GT        9.6           33.6              16.8             504           57.6
GeForce 8800 GTS      10.0           12.0              12.0             346           64.0
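The 9600 GT's numbers follow directly from those base clocks and unit counts. Here's a quick sketch of the arithmetic; the GFLOPS figure uses the generous counting of three flops per SP per clock (MAD plus MUL), which is the convention behind the table:

```python
# Theoretical peak rates for the GeForce 9600 GT, derived from its
# published base clocks and unit counts (not measured performance).

core_mhz = 650       # GPU core clock
shader_mhz = 1625    # stream processor clock
mem_mhz_eff = 1800   # effective GDDR3 data rate (900MHz DDR)

rops = 16            # ROP units, one pixel per clock each
texels_per_clk = 32  # bilinear texels filtered per clock
sps = 64             # stream processors
bus_bits = 256       # memory interface width

pixel_fill = rops * core_mhz / 1000            # Gpixels/s
texel_rate = texels_per_clk * core_mhz / 1000  # Gtexels/s
# Three flops per SP per clock (MAD + MUL); the more conservative
# MAD-only count of two flops would cut this figure by a third.
gflops = sps * 3 * shader_mhz / 1000
bandwidth = (bus_bits / 8) * mem_mhz_eff / 1000  # GB/s

print(pixel_fill, texel_rate, gflops, bandwidth)
# → 10.4 20.8 312.0 57.6
```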

As expected, the 9600 GT trails the 8800 GT in terms of texture filtering capacity and shader processing power, but it has just as much pixel fill rate and memory bandwidth as its big brother. More notably, look at how the 9600 GT matches up to the GeForce 8800 GTS, a card that was selling for $400 less than a year ago.

Making these theoretical comparisons to entirely different GPU architectures like the RV670 is rather tricky. On paper, the 9600 GT looks overmatched versus the Radeon HD 3870, even though we've given the GeForce cards the benefit of the doubt here in terms of our FLOPS estimates. (Another way of counting would cut the GeForces' FLOPS count by a third.) We'll have to see how that works out in practice.

Incidentally, the 9600 GT's performance will be helped at higher resolutions by a feature carried over from the G92: improved color compression. All GeForce 8-series GPUs compress color data for textures and render targets in their ROP subsystems in order to save bandwidth. The G92 and G94 have expanded compression coverage, which Nvidia says is now sufficient for running games at resolutions up to 2560x1600 with 4X antialiasing.

The chip
Like the G92 before it, the G94 GPU is manufactured on a 65nm fabrication process. That leaves AMD with something of an edge, since the RV670 is made using a smaller 55nm process. Nvidia estimates the G94's transistor count at 505 million, versus 754 million for the G92. AMD seems to count a little differently, but it estimates the RV670 at a sinister 666 million transistors.

[Die shot comparison of the G92, G94, and RV670]

Here's a quick visual comparison of the three chips. By my measurements, the G94 is approximately 240 mm², quite a bit smaller than the G92 at 324 mm² but not as small as the RV670 at 192 mm². Obviously, the G94 is very much in the same class as the RV670, and it should give Nvidia a much more direct competitor to AMD's strongest product.
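Those transistor counts and die sizes imply rather different densities, which is roughly what the 55nm process buys AMD. A back-of-the-envelope sketch using the figures quoted above:

```python
# Transistor density implied by the quoted counts and die sizes.
chips = {
    "G94":   (505, 240),  # millions of transistors, die area in mm^2
    "G92":   (754, 324),
    "RV670": (666, 192),
}

density = {name: round(m / area, 2) for name, (m, area) in chips.items()}
print(density)
# → {'G94': 2.1, 'G92': 2.33, 'RV670': 3.47}
```

The 55nm RV670 packs roughly half again as many transistors per square millimeter as either 65nm Nvidia chip, which is consistent with the process-node gap the text describes.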