You may have noticed that, in 2013, things have gotten distinctly weird in the world of the traditional personal computer. The PC is besieged on all sides by sliding sales, gloom-and-doom prophecies, and people making ridiculous claims about how folks would rather play shooters with a gamepad than a mouse and keyboard. There's no accounting for taste, I suppose, which perhaps explains that last one. Fortunately, there is accounting for other things, and the fact remains that PC gaming is a very big business. Yet the PC is in a weird place, with lots of uncertainty clouding its future, which probably helps explain why Nvidia has taken so long—nearly a year after the debut of the GeForce GTX 680—to wrap a graphics card around the biggest chip based on its Kepler architecture, the GK110.
The GK110 has been available for a while aboard Tesla cards aimed at the GPU computing market. This chip's most prominent mission to date has been powering the Titan supercomputer at Oak Ridge National Laboratory. The Titan facility alone soaked up 18,688 GK110 chips in Tesla trim, which can't have been cheap. Maybe that's why Nvidia has decided to name the consumer graphics card in honor of the supercomputer. Behold, the GeForce GTX Titan:
Of course, it doesn't hurt that "Titan" is conveniently outside of the usual numeric naming scheme for GeForce cards. Nvidia is free to replace any of the cards beneath this puppy in its product lineup without those pesky digits suggesting obsolescence. And let's be clear, practically anything else in Nvidia's lineup is certain to be below the Titan. This card is priced at $999.99, one deeply meaningful penny shy of a grand. As the biggest, baddest single-GPU solution on the block, the Titan commands a hefty premium. As with the GeForce GTX 690 before it, though, Nvidia has gone to some lengths to make the Titan look and feel worthy of its asking price. The result is something a little different from what we've seen in the past, and it makes us think perhaps this weird new era isn't so bad, provided you can afford to play in it.
GK110: one big chip
I won't spend too much time talking about the GK110 GPU, since we published an overview of the chip and architecture last year. Most of the basics of the chip are there, although Nvidia wasn't talking too specifically about the graphics resources onboard at that time. Fortunately, virtually all of the guesses we made back then about the chip's unit counts and such were correct. Here's how the GK110's basic specs match up to other recent GPUs:
Suffice to say that the GK110 is the largest, most capable graphics processor on the planet. At 551 mm², the chip's die size eclipses anything else we've seen in the past couple of years. You'd have to reach back to the GT200 chip in the GeForce GTX 280 in order to find its equal in terms of sheer die area. Of course, as a 28-nm chip, the GK110 packs in many more transistors than any GPU that has come before.
Here's a quick logical block diagram of the GK110, which I've strategically shrunk beyond the point of readability. Your optometrist will thank me later. Zoom in a little closer on one of those GPC clusters, and you'll see the chiclets that represent real functional units a little bit more clearly.
In most respects, this chip is just a scaled-up version of the GK104 GPU that powers the middle of the GeForce GTX 600 lineup. The differences include the fact that each "GPC," or graphics processing cluster, includes three SMX engines rather than two. Also, there are five GPCs in total on the GK110, one more than on the GK104. Practically speaking, that means general shader processing power has been scaled up a little more aggressively than rasterization rates have been. We think that's easily the right choice, since performance in today's games tends to be bound by things other than triangle throughput.
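For the arithmetically inclined, here is a rough sketch of that scaling, using only the GPC and SMX counts quoted above. It counts the full silicon, and it assumes that rasterization rate tracks the number of GPCs, which is my simplification rather than anything Nvidia has spelled out here. The names `CHIPS`, `total_smx`, and `scaling` are made up for illustration.

```python
from fractions import Fraction

# Counts stated in the text: GPCs per chip and SMX engines per GPC.
CHIPS = {
    "GK104": {"gpcs": 4, "smx_per_gpc": 2},
    "GK110": {"gpcs": 5, "smx_per_gpc": 3},
}


def total_smx(chip):
    """Total SMX engines on the full chip (no units disabled)."""
    c = CHIPS[chip]
    return c["gpcs"] * c["smx_per_gpc"]


def scaling(big="GK110", small="GK104"):
    """Return (shader_scale, raster_scale) of big over small, clock for clock."""
    # Shader resources live in the SMX engines, so they scale with SMX count.
    shader = Fraction(total_smx(big), total_smx(small))
    # Assumption: rasterization rate scales with the number of GPCs.
    raster = Fraction(CHIPS[big]["gpcs"], CHIPS[small]["gpcs"])
    return shader, raster


shader, raster = scaling()
print(f"SMX engines: {total_smx('GK110')} vs {total_smx('GK104')}")
print(f"shader scale {float(shader):.3f}x, raster scale {float(raster):.2f}x")
```

Under those assumptions, the SMX count grows from 8 to 15 while the GPC count grows only from 4 to 5, which is the lopsided scaling described above.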
Versus the GK104 silicon driving the GeForce GTX 680, the GK110 chip beneath the Titan's cooler has considerably more power on a clock-for-clock basis: 50% more pixel-pushing power, anti-aliasing grunt, and memory bandwidth; about 75% more texture filtering capacity; and not far from double the shader processing power. The GK104 has proven to be incredibly potent in today's games, but the GK110 brings more of just about everything that matters to the party.
The GK110 also brings something that has no real use for gaming: considerable support for double-precision floating-point math. Each SMX engine has 64 DP-capable ALUs, alongside 192 single-precision ALUs, so DP math happens at one-third the rate of SP. This feature is intended solely for the GPU computing market. Virtually nothing in real-time graphics or even consumer GPU computing really requires that kind of mathematical precision, so Nvidia's choice to leave this functionality intact on the Titan is an interesting one. It may also explain, in part, the Titan's formidable price, since Nvidia wouldn't wish to undercut its Tesla cards bearing the same silicon. Nevertheless, the Titan may prove attractive to some would-be GPU computing developers who like to play a little Battlefield 3 on the weekends.
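The one-third figure falls straight out of the ALU counts. Here is a minimal sketch of the per-SMX math, assuming the usual convention that a fused multiply-add counts as two floating-point operations. That convention and the names below are my own additions, not something taken from Nvidia's documentation.

```python
from fractions import Fraction

SP_ALUS_PER_SMX = 192  # single-precision ALUs per SMX, from the text
DP_ALUS_PER_SMX = 64   # double-precision ALUs per SMX, from the text
FLOPS_PER_FMA = 2      # assumption: one fused multiply-add counts as two flops


def dp_to_sp_ratio():
    """Double-precision rate as a fraction of the single-precision rate."""
    return Fraction(DP_ALUS_PER_SMX, SP_ALUS_PER_SMX)


def peak_flops_per_clock(alus):
    """Theoretical peak flops per clock cycle for a given number of ALUs."""
    return alus * FLOPS_PER_FMA


print("DP:SP ratio per SMX:", dp_to_sp_ratio())
print("SP flops/clock per SMX:", peak_flops_per_clock(SP_ALUS_PER_SMX))
print("DP flops/clock per SMX:", peak_flops_per_clock(DP_ALUS_PER_SMX))
```

Multiply the per-clock numbers by the number of active SMX units and the clock frequency to get a theoretical peak. Real workloads will land somewhere below that.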
Double-precision support on the Titan is a bit funky. One must enable it via the Nvidia control panel, and once it's turned on, the card operates at a somewhat lower clock frequency. Ours ran a graphics demo at about 15 MHz below the Titan's base clock speed after we enabled double precision.
Oh, before we go on, I should mention that the GK110 chips aboard Titan cards will have one of their 15 SMX units disabled. On a big chip like this, disabling an area in order to improve yields is a very familiar practice. Let's put that into perspective using my favorite point of reference. The loss of the SMX adds up to about two Xbox 360s' worth of processing power: 192 ALUs and 16 texture units at nearly twice the clock speed of an Xbox. But don't worry; the GK110 has 14 more SMX units on hand.
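To tally what's left after that bit of yield harvesting, here is a quick sketch built from the per-SMX figures above. The helper name `enabled_units` is hypothetical, and the totals are simple multiplication rather than numbers lifted from a spec sheet.

```python
TOTAL_SMX = 15     # SMX units on the full GK110, from the text
DISABLED_SMX = 1   # units fused off on Titan cards, from the text
ALUS_PER_SMX = 192 # single-precision ALUs per SMX, from the text
TEX_PER_SMX = 16   # texture units per SMX, from the text


def enabled_units(total=TOTAL_SMX, disabled=DISABLED_SMX):
    """Resources still active once some SMX units are disabled."""
    active = total - disabled
    return {
        "smx": active,
        "alus": active * ALUS_PER_SMX,
        "texture_units": active * TEX_PER_SMX,
    }


titan = enabled_units()
lost_share = DISABLED_SMX / TOTAL_SMX
print(titan)
print(f"share of shader resources disabled: {lost_share:.1%}")
```

By this count, the Titan gives up one-fifteenth of the chip's shader and texturing resources, or a little under 7%.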