
Nvidia's GeForce GTX Titan X graphics card reviewed


Your GTX 980 is puny. I spit on it. Ptoo.
— 2:00 PM on March 17, 2015

Some things don't require a ton of introduction. Like a second Avengers movie or a new Corvette. You pretty much know the deal going in, and the rest is details. When Nvidia CEO Jen-Hsun Huang pulled out a Titan X graphics card onstage at GDC, most folks in the room probably knew the deal right away. The Titan X would be the world's fastest single-GPU graphics card. The rest? Details.


Happily, it's time to fill in a bunch of those details, since the Titan X is officially making its debut today. You may have already heard that it's based on a brand-new graphics processor, the plus-sized GM200, which packs pretty much 50% more of everything than the GPU inside the GeForce GTX 980. Perhaps you've begun to absorb the fact that the Titan X ships with 12GB of video memory, which is enough to supply the main memory partitions on six GeForce GTX 960 cards—or 3.42857 GeForce GTX 970s. (Nerdiest joke you'll hear all week, folks. Go ahead and subscribe.)

These details paint a bigger picture whose outlines are already obvious. Let's fill in a few more and then see exactly how the Titan X performs.

The truly big Maxwell: GM200

The GM200 GPU aboard the Titan X is based on the same Maxwell architecture that we've seen in lower-end GeForce cards, and in many ways, it follows the template set by Nvidia in the past. The big version of the new GPU architecture often arrives a little later, but when it does, good things happen on a much larger scale.

The GM200—and by extension, the Titan X—differs from past full-sized Nvidia GPUs in one key respect, though. This chip is made almost purely for gaming and graphics; its support for double-precision floating-point math is severely limited. Double-precision calculations happen at only 1/32nd of the single-precision rate.
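To put that 1/32 ratio in perspective, here's a rough back-of-the-envelope sketch. It uses the Titan X's 3072 shader processors and 1076MHz boost clock, and it assumes each ALU can retire one fused multiply-add (counted as two floating-point operations) per clock. That last bit is the usual convention for quoting peak rates, not something spelled out in this review, so treat the results as theoretical ceilings.

```python
# Rough peak-throughput estimate for the Titan X.
# Assumption (not from the article): one fused multiply-add, counted as
# two floating-point operations, per shader ALU per clock.
SHADER_ALUS = 3072            # shader processors on the GM200
BOOST_CLOCK_MHZ = 1076        # Titan X boost clock
FLOPS_PER_ALU_PER_CLOCK = 2   # assumed FMA convention
DP_RATE = 1 / 32              # double precision runs at 1/32 the single-precision rate


def peak_gflops(alus: int, clock_mhz: float, flops_per_clock: int) -> float:
    """Peak single-precision rate in GFLOPS."""
    return alus * clock_mhz * flops_per_clock / 1000.0


sp_gflops = peak_gflops(SHADER_ALUS, BOOST_CLOCK_MHZ, FLOPS_PER_ALU_PER_CLOCK)
dp_gflops = sp_gflops * DP_RATE

print(f"Single precision: {sp_gflops / 1000:.2f} TFLOPS")
print(f"Double precision: {dp_gflops:.0f} GFLOPS")
```

Under those assumptions, the card peaks at about 6.6 teraflops for single-precision math and only about 207 gigaflops for double precision.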

For gamers, that's welcome news. I'm happy to see Nvidia committing the resources to build a big chip whose primary mission in life is graphics, and this choice means the GM200 can pack in more resources to help pump out the eye candy. But for those folks wanting to do GPU computing with the Titan X, this news may be rather disappointing. The Titan X will likely still be quite potent for applications that require only single-precision datatypes, but the scope of possible applications will be limited. Folks whose work depends on double-precision math will have to look elsewhere.

Speaking of geeky details about chips, here's an intimidating table:

| | ROP pixels/clock | Texels filtered/clock (int/fp16) | Shader processors | Rasterized triangles/clock | Memory interface width (bits) | Estimated transistor count (Millions) | Die size (mm²) | Fab process |
|---|---|---|---|---|---|---|---|---|
| GK110 | 48 | 240/240 | 2880 | 5 | 384 | 7100 | 551 | 28 nm |
| GM204 | 64 | 128/128 | 2048 | 4 | 256 | 5200 | 398 | 28 nm |
| GM200 | 96 | 192/192 | 3072 | 6 | 384 | 8000 | 601 | 28 nm |
| Tahiti | 32 | 128/64 | 2048 | 2 | 384 | 4310 | 365 | 28 nm |
| Tonga | 32 (48) | 128/64 | 2048 | 4 | 256 (384) | 5000 | 359 | 28 nm |
| Hawaii | 64 | 176/88 | 2816 | 4 | 512 | 6200 | 438 | 28 nm |

Like every other chip in the Maxwell family and the ones in the Kepler generation before it, the GM200 is manufactured on a 28-nm fab process. Compared to the GK110 chip that powers older Titans, the GM200 is about 50 square millimeters larger and crams in an additional billion or so transistors.


A simplified block diagram of the GM200. Source: Nvidia.

The block diagram above may be too small to read in any detail, but it will look familiar if you've read our past coverage of the Maxwell architecture in our GeForce GTX 750 Ti and GTX 980 reviews. What you see above signifies 50% more of almost everything compared to the GM204 chip that drives the GTX 980. The GM200 has six graphics processing clusters, or GPCs, which are nearly complete GPUs unto themselves. In total, it has 24 shader multiprocessors, or SMs, each of which has four "cores." (Nvidia calls them quads and calls ALU slots cores, but that's just marketing inflation. Somehow 96 "cores" wasn't impressive enough.) Across the whole chip, the GM200 has a grand total of 3072 shader ALU slots, which we've reluctantly agreed to call shader processors.
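If you want to see how those counts nest, here's a quick sketch of the arithmetic. The GPC, SM, quad, and ALU totals come straight from the paragraph above; the per-quad and per-SM ALU figures are derived by division rather than quoted, so consider them an inference.

```python
# Shader hierarchy of the GM200, using the counts quoted above.
GPCS = 6             # graphics processing clusters
SMS = 24             # shader multiprocessors across the whole chip
QUADS_PER_SM = 4     # the four "cores" inside each SM
SHADER_ALUS = 3072   # total ALU slots ("shader processors")

sms_per_gpc = SMS // GPCS
total_quads = SMS * QUADS_PER_SM

# Derived by division, not quoted in the text.
alus_per_quad = SHADER_ALUS // total_quads
alus_per_sm = alus_per_quad * QUADS_PER_SM

print(f"SMs per GPC:   {sms_per_gpc}")
print(f"Total quads:   {total_quads}")
print(f"ALUs per quad: {alus_per_quad}")
print(f"ALUs per SM:   {alus_per_sm}")
```

That works out to four SMs per GPC, 96 quads in total, and 32 ALU slots per quad, or 128 per SM.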

Compared to the GK110 chip before it, the GM200 has a somewhat different mix of resources. The big Maxwell has twice as many ROPs, which should give it substantially higher pixel throughput and more capacity for the blending work needed for multisampled antialiasing (MSAA). The GM200 also has a few more shader ALU slots and can rasterize one additional triangle per clock cycle. Notably, though, the new Maxwell's texture filtering capacity is a little lower than its predecessor's.
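For the curious, the comparisons above are easy to check against the chip table. The snippet below is just a sketch that divides one chip's numbers by another's; the dictionary keys are names I made up for illustration, and the figures are the ones from the table.

```python
# Per-chip resources, copied from the chip table in this article.
# Key names are illustrative, not Nvidia terminology.
CHIPS = {
    "GK110": {"rops": 48, "texels": 240, "shaders": 2880, "tris": 5,
              "mem_bits": 384, "transistors_m": 7100, "die_mm2": 551},
    "GM204": {"rops": 64, "texels": 128, "shaders": 2048, "tris": 4,
              "mem_bits": 256, "transistors_m": 5200, "die_mm2": 398},
    "GM200": {"rops": 96, "texels": 192, "shaders": 3072, "tris": 6,
              "mem_bits": 384, "transistors_m": 8000, "die_mm2": 601},
}


def ratios(new: str, old: str) -> dict:
    """Ratio of each resource on chip `new` to the same resource on chip `old`."""
    return {key: CHIPS[new][key] / CHIPS[old][key] for key in CHIPS[new]}


for old in ("GM204", "GK110"):
    print(f"GM200 vs. {old}:")
    for key, value in ratios("GM200", old).items():
        print(f"  {key:14s} {value:.2f}x")
```

Against the GM204, every per-clock resource comes out at exactly 1.5x, with transistor count and die size just a hair above that. Against the GK110, the ROP count is 2x, the shader count is about 1.07x, and texture filtering drops to 0.8x.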

These changes aren't anything too shocking given what we've seen from other Maxwell-based GPUs. Thing is, the Maxwell architecture includes a bunch of provisions to make sure it takes better advantage of its resources, and that's where the real magic is. For instance, the chips' L2 cache sizes aren't shown in the table above, but they probably should be. The GM200's cache is 3MB, double the size of the GK110's. The added caching may help make up for the deficit in raw texture filtering rates. Also, Maxwell-based chips have a simpler SM structure with more predictable instruction scheduling. That revised arrangement can potentially keep the shader ALUs more consistently occupied. And Maxwell chips can better compress frame buffer data, which means the GM200 should extract more effective bandwidth from its memory interface, even though that interface is the same 384 bits wide as the GK110's.

In fact, I'm pretty sure there's at least one significant new feature built into the Maxwell architecture that Nvidia isn't telling us about. Maxwell-based GPUs are awfully efficient compared to their Kepler forebears, and I don't think we know entirely why. We'll have to defer that discussion for another time.

Nvidia Titans the screws

| | GPU base clock (MHz) | GPU boost clock (MHz) | ROP pixels/clock | Texels filtered/clock | Shader processors | Memory path (bits) | GDDR5 transfer rate | Memory size | Peak power draw | Intro price |
|---|---|---|---|---|---|---|---|---|---|---|
| GTX 960 | 1126 | 1178 | 32 | 64 | 1024 | 128 | 7 GT/s | 2 GB | 120W | $199 |
| GTX 970 | 1050 | 1178 | 56 | 104 | 1664 | 224+32 | 7 GT/s | 3.5+0.5GB | 145W | $329 |
| GTX 980 | 1126 | 1216 | 64 | 128 | 2048 | 256 | 7 GT/s | 4 GB | 165W | $549 |
| Titan X | 1002 | 1076 | 96 | 192 | 3072 | 384 | 7 GT/s | 12 GB | 250W | $999 |

The Titan X is by far the most potent member of Nvidia's revamped GeForce lineup. The GM200 GPU has a base clock of about 1GHz, a little lower than the speeds you'll see on the GTX 980. The slower clocks are kind of expected from a bigger chip, but the Titan X more than makes up for it by having more of everything else—including a ridiculous 12GB of GDDR5 memory. I don't think anybody technically needs that much video RAM just yet, but I'm sure Nvidia is happy to sell it at the Titan's lofty sticker price.
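Here's a rough sketch of what "more of everything else" buys in peak theoretical terms, worked out from the spec table. It assumes each card sustains its rated boost clock, which real cards may not do under every workload, and it simply adds the GTX 970's two memory partitions together. The function and variable names are mine, purely for illustration.

```python
# Peak theoretical rates derived from the spec table in this article.
# Assumption: the GPU runs at its rated boost clock the whole time.
CARDS = {
    # name: (boost MHz, ROP pixels/clock, texels/clock, memory bits, GT/s)
    "GTX 960": (1178, 32, 64, 128, 7),
    "GTX 970": (1178, 56, 104, 224 + 32, 7),  # both partitions summed for simplicity
    "GTX 980": (1216, 64, 128, 256, 7),
    "Titan X": (1076, 96, 192, 384, 7),
}


def peak_rates(boost_mhz, rops, texels, mem_bits, gtps):
    """Return (Gpixels/s, Gtexels/s, GB/s) at the boost clock."""
    pixel_fill = rops * boost_mhz / 1000      # pixels/clock x clocks/s
    texel_rate = texels * boost_mhz / 1000    # texels/clock x clocks/s
    bandwidth = mem_bits * gtps / 8           # bits x transfers/s, in bytes
    return pixel_fill, texel_rate, bandwidth


for name, specs in CARDS.items():
    pix, tex, bw = peak_rates(*specs)
    print(f"{name}: {pix:.1f} Gpixels/s, {tex:.1f} Gtexels/s, {bw:.0f} GB/s")
```

By this math, the Titan X lands at roughly 103 Gpixels/s, 207 Gtexels/s, and 336 GB/s, versus about 78 Gpixels/s, 156 Gtexels/s, and 224 GB/s for the GTX 980.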

Heck, to my frugal Midwestern mind, the most exciting thing about the Titan is the fact that it portends the release of a slightly cut-down card based on the GM200, likely with 6GB of VRAM, for less money.

Nvidia has equipped the Titan X with its familiar dual-slot aluminum cooler, but this version has been coated with a spiffy matte-black finish. The result is a look similar to a blacked-out muscle car, and I think it's absolutely bad-ass. Don't tell the nerds who read my website that I got so excited about paint colors, though, please. Thanks.

Many of the mid-range cards floating around in Damage Labs these days have larger coolers than the Titan X, so it's kind of impressive what Nvidia has been able to do in a reasonable form factor. The Titan X requires two aux power inputs, one six-pin and one eight-pin, and it draws a peak of 250W total. Nvidia recommends a 600W PSU in order to drive it.
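The power plumbing adds up like so. This is a minimal sketch that assumes the usual PCI Express limits of 75W from the slot, 75W from a six-pin connector, and 150W from an eight-pin connector; those limits aren't stated in this review, so take them as my assumption. The 250W peak and 600W PSU recommendation are Nvidia's figures from above.

```python
# Power-delivery budget for the Titan X.
# Assumption (PCI Express limits, not from the article): the slot supplies
# up to 75W, a six-pin connector 75W, and an eight-pin connector 150W.
SLOT_W = 75
SIX_PIN_W = 75
EIGHT_PIN_W = 150

PEAK_DRAW_W = 250         # Titan X peak power draw
RECOMMENDED_PSU_W = 600   # Nvidia's PSU recommendation

available_w = SLOT_W + SIX_PIN_W + EIGHT_PIN_W
headroom_w = available_w - PEAK_DRAW_W
psu_share = PEAK_DRAW_W / RECOMMENDED_PSU_W

print(f"Available to the card: {available_w}W")
print(f"Headroom over peak draw: {headroom_w}W")
print(f"Share of recommended PSU: {psu_share:.0%}")
```

Under those assumptions, the card can pull up to 300W through its inputs, leaving 50W of headroom over its rated peak, and the card alone accounts for about 42% of the recommended PSU's capacity.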

We could revel in even more of the Titan X's details, but I think you've got the picture by now. Let's see how it handles.