
Nvidia's GeForce GTX 960 graphics card reviewed


The green team brandishes a sawed-off shotgun
— 8:00 AM on January 22, 2015

If you haven't been around here for long, you may not be familiar with the world's smallest chainsaw. This crucial, mythic piece of equipment is what the smart folks at GPU companies use to cut down their chips to more affordable formats.

Say that you have a fairly robust graphics chip like the GM204 GPU that powers the GeForce GTX 980, with four Xbox Ones' worth of shader processing power on tap. That's pretty good, right? But it's also expensive; the GTX 980 lists for $549. Say you want to make a more affordable product based on this same tech. That's when you grab the tiny starter pull cord between your thumb and index finger and give it an adorable little tug. The world's smallest chainsaw sputters to life. Saw the GM204 chip exactly in half, blow away the debris with a puff of air, and you get the GM206 GPU that powers Nvidia's latest graphics card, the GeForce GTX 960.

For just under two hundred bucks, the GTX 960 gives you half the power of a GeForce GTX 980 (or, you know, two Xbox Ones' worth of shader processing grunt). Better yet, because it's based on Nvidia's ultra-efficient Maxwell architecture, the GTX 960 ought to perform much better than its specs would seem to indicate. Can the GTX 960 live up to the standards set by its older siblings? Let's have a look.


There's lots o' GTX 960-based trouble brewing in Damage Labs

The sawed-off Maxwell: GM206
I suppose there are several things to be said at this juncture.

First, although "half a GTX 980" might not sound terribly sexy, I think this product is a pretty significant one for Nvidia. The GeForce GTX 970 and 980 have been a runaway success, so much so that the company uncharacteristically shared some sales numbers with us: over a million GTX 970 and 980 cards have been sold to consumers, a huge number in the realm of high-end graphics. What's more, we have reason to believe that estimate is already pretty dated. The current number could be nearly twice that.

But most people don't buy high-end graphics cards. Even among PC gamers, less expensive offerings are more popular. And the GTX 960 lands smack-dab in the "sweet spot" where most folks like to buy. If the prospect of "way more performance for about $200" sounds good to you, well, you're definitely not alone.

Also, there is no chainsaw. I probably made an awful lot of hard-working chip guys cringe with my massive oversimplification above. Although the GM206 really does have half of nearly all key graphics resources compared to the GM204, it's not just half the chip. These things aren't quite that modular—not that you'd know that from this block diagram, which looks for all the world like half a GM204.


A simplified block diagram of the GM206. Source: Nvidia.

The GM206 has two graphics processing clusters, almost complete GPUs unto themselves, with four shader multiprocessor (SM) cores per cluster, for a total of eight. Here's how the chip stacks up to other current GPUs.

| GPU | ROP pixels/clock | Texels filtered/clock (int/fp16) | Shader processors | Rasterized triangles/clock | Memory interface width (bits) | Estimated transistor count (millions) | Die size (mm²) | Fab process |
|---|---|---|---|---|---|---|---|---|
| GK106 | 24 | 80/80 | 960 | 3 | 192 | 2540 | 214 | 28 nm |
| GK104 | 32 | 128/128 | 1536 | 4 | 256 | 3500 | 294 | 28 nm |
| GK110 | 48 | 240/240 | 2880 | 5 | 384 | 7100 | 551 | 28 nm |
| GM206 | 32 | 64/64 | 1024 | 2 | 128 | 2940 | 227 | 28 nm |
| GM204 | 64 | 128/128 | 2048 | 4 | 256 | 5200 | 398 | 28 nm |
| Pitcairn | 32 | 80/40 | 1280 | 2 | 256 | 2800 | 212 | 28 nm |
| Tahiti | 32 | 128/64 | 2048 | 2 | 384 | 4310 | 365 | 28 nm |
| Tonga | 32 (48) | 128/64 | 2048 | 4 | 256 (384) | 5000 | 359 | 28 nm |
| Hawaii | 64 | 176/88 | 2816 | 4 | 512 | 6200 | 438 | 28 nm |

As you can see, the GM206 is a lightweight. The chip's area is only a little larger than that of the GK106 GPU that powers the GeForce GTX 660 or the Pitcairn chip from the Radeon R9 270X. Compared to those two, though, the GM206 has a narrower memory interface. In fact, at 128 bits, the GM206 has the narrowest memory path of any chip in the table above. Typically, GPUs of this size have wider interfaces.
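If you'd like to check the "half a GM204, but not quite half the chip" point for yourself, here's a quick sketch that divides the GM206 row of the table by the GM204 row. The field names and the `ratios` helper are my own labels for illustration, not anything from Nvidia; the numbers are copied straight from the table above.

```python
# Figures copied from the GM206 and GM204 rows of the table above.
# Field names are illustrative labels, not official terminology.
FIELDS = (
    "rop_pixels_per_clock",
    "texels_per_clock",
    "shader_processors",
    "triangles_per_clock",
    "memory_bus_bits",
    "transistors_millions",
    "die_size_mm2",
)

GM206 = dict(zip(FIELDS, (32, 64, 1024, 2, 128, 2940, 227)))
GM204 = dict(zip(FIELDS, (64, 128, 2048, 4, 256, 5200, 398)))


def ratios(small, big):
    """Return small/big for every field both chips share."""
    return {name: small[name] / big[name] for name in FIELDS}


r = ratios(GM206, GM204)
for name, value in r.items():
    print(f"{name:22s} {value:.2f}")
```

The five resource counts all come out at exactly half, while the transistor count and die area land closer to 57% of the GM204's. Presumably that's the fixed overhead that doesn't get cut in two along with the shader clusters.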

The GM206 may be able to get away with less thanks to the Maxwell architecture's exceptional efficiency. Maxwell GPUs tend to like high memory frequencies, and the GTX 960 follows suit with a 7 GT/s transfer rate for its GDDR5 RAM. So there's more throughput on tap than one might think. Beyond that, this architecture makes very effective use of its memory bandwidth thanks to a new compression scheme that can, according to Nvidia's architects, reduce memory bandwidth use by between 17% and 29% in common workloads based on popular games.
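To put rough numbers on that, here's a back-of-the-envelope sketch of the arithmetic: peak bandwidth is just bus width in bytes times the transfer rate. The second function is my own simplification, not Nvidia's math. It assumes the quoted 17% to 29% savings apply uniformly to all memory traffic, which real workloads won't do, so treat the "effective" figures as a ballpark at best.

```python
def peak_bandwidth_gbs(bus_width_bits, transfer_rate_gtps):
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * transfer_rate_gtps


def effective_bandwidth_gbs(raw_gbs, traffic_saving):
    """Bandwidth an uncompressed design would need to move the same data.

    Assumption: the saving (e.g. 0.17 for 17%) applies evenly to all traffic.
    """
    return raw_gbs / (1.0 - traffic_saving)


# GTX 960: 128-bit interface at 7 GT/s, per the text above.
raw = peak_bandwidth_gbs(128, 7.0)
low = effective_bandwidth_gbs(raw, 0.17)
high = effective_bandwidth_gbs(raw, 0.29)

print(f"raw peak:  {raw:.0f} GB/s")
print(f"effective: {low:.0f} to {high:.0f} GB/s (under the uniform-saving assumption)")
```

Under those assumptions, the 128-bit bus works out to 112 GB/s of raw bandwidth, and the compression would make it behave more like something in the mid-130s to high-150s.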

Interestingly, Nvidia identifies the Radeon R9 285 as the GTX 960's primary competitor. The R9 285 is based on a much larger GPU named Tonga, which is the only new graphics chip AMD introduced in 2014. (The R9 285 ships with a 256-bit memory interface, although I still believe that the Tonga chip itself is probably capable of a 384-bit memory config. For whatever reason, perhaps lots of inventory of the existing Hawaii and Tahiti chips, AMD has chosen not to ship a card with a fully enabled Tonga onboard.) Even with several bits disabled, the R9 285 has a much wider memory path and more resources of nearly every kind at its disposal than the GM206 does. If the GTX 960's performance really is competitive with the R9 285, it will be a minor miracle of architectural efficiency.