When Nvidia unleashed the GeForce GTX Titan a few months ago, that card's combination of world-beating performance and an eye-popping $1K price tag pretty much immediately ignited speculation about what would come next. Surely a somewhat trimmed-down version of this same GPU would be used to power a less expensive product before long, right?
Well, yep. We're gathered here today to say hello to the GeForce GTX 780, which is a GeForce Titan that's gotten an incredibly close haircut. The result is a card that closely resembles the Titan, only with a few hundred bucks off the top. The 780's performance is a little lower, but only by a couple of Xbox 360s—in other words, not enough that anybody is likely to notice, given the sheer scale of the remaining graphics processing power.
The GTX 780: Not quite Titanic
Like the Titan, the GeForce GTX 780 is based on the GK110 graphics chip, the big daddy of Nvidia's Kepler lineup. To understand the 780's relationship to the Titan, let's pull up a functional block diagram of the GK110 GPU. This diagram is grossly oversimplified, yet we've had to shrink it down to nearly unreadable size in order to fit it on the page. These things happen when you're dealing with, you know, the most complex consumer semiconductor product in history. With 7.1 billion transistors, the GK110 is beefier than a Five Layer Burrito.
The squint-inducing diagram above shows the GK110's five graphics processing clusters, or GPCs. Each GPC is virtually a GPU unto itself, with its own rasterizer engine and three separate shader multiprocessing engines, or SMX units. Each SMX then has 192 shader ALUs (often called "shader cores" by marketing types who may or may not know better) and 16 texture units. Scale all of these resources up across five GPCs, and you have a massive pool of graphics processing resources.
Trouble is, you also have a really huge chip where a single flaw or weakness could scuttle the whole thing. To manage that problem, chipmakers will disable portions of a chip that aren't quite perfect. Aboard the Titan, the GK110 has one of its 15 SMX units disabled. In the GTX 780, three of the SMX units have been shut down, leaving 12 active.
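For the curious, the unit counts that fall out of those disabled SMX units are simple multiplication. Here's a minimal Python sketch using the per-SMX figures quoted above; the function and constant names are our own, purely for illustration.

```python
# Per-chip figures quoted in the text for the GK110.
GPCS = 5
SMX_PER_GPC = 3
ALUS_PER_SMX = 192
TEX_UNITS_PER_SMX = 16


def active_units(disabled_smx: int) -> dict:
    """Return active SMX, shader ALU, and texture unit counts."""
    total_smx = GPCS * SMX_PER_GPC  # 15 on a fully enabled GK110
    if not 0 <= disabled_smx <= total_smx:
        raise ValueError("disabled_smx must be between 0 and 15")
    smx = total_smx - disabled_smx
    return {
        "smx": smx,
        "shader_alus": smx * ALUS_PER_SMX,
        "texture_units": smx * TEX_UNITS_PER_SMX,
    }


titan = active_units(disabled_smx=1)
gtx_780 = active_units(disabled_smx=3)
print("Titan:  ", titan)
print("GTX 780:", gtx_780)
```

With one SMX disabled, that works out to 2688 ALUs and 224 texture units for the Titan; with three disabled, the GTX 780 lands at 2304 and 192.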
Interestingly enough, that change means different things in different cases. Some GTX 780 cards will have all three SMX units in a single GPC disabled, so the entire GPC goes dark. In that case, the card will have four raster engines, so its peak rasterization rate will be four triangles per clock. Other 780 cards may have their disabled SMX units spread around, so all five GPCs and raster engines remain active. Which configuration you get is presumably the luck of the draw. I'd get spun up about the potential disparity, but I don't think the 780's rasterization rates are likely to limit its gaming performance any time soon.
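To make the two scenarios concrete, here's a rough sketch of how the raster engine count depends on where the disabled SMX units sit. It assumes, as described above, one raster engine per GPC that goes dark only when all three of that GPC's SMX units are off. The layouts and the function name are hypothetical examples, not anything Nvidia has published.

```python
SMX_PER_GPC = 3  # from the text: three SMX units per GPC


def active_raster_engines(disabled_per_gpc: list) -> int:
    """Count GPCs that still have at least one working SMX.

    Assumes one raster engine per GPC, each good for one triangle
    per clock, so the result is also the peak triangles per clock.
    """
    if any(not 0 <= d <= SMX_PER_GPC for d in disabled_per_gpc):
        raise ValueError("each GPC can have 0 to 3 SMX units disabled")
    return sum(1 for d in disabled_per_gpc if d < SMX_PER_GPC)


# Hypothetical layouts for a GTX 780 with three SMX units disabled.
one_gpc_dark = [3, 0, 0, 0, 0]
spread_out = [1, 1, 1, 0, 0]

print(active_raster_engines(one_gpc_dark), "triangles per clock")
print(active_raster_engines(spread_out), "triangles per clock")
```

The first layout leaves four raster engines running, while the second keeps all five alive.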
Speaking of things that don't matter much, Nvidia has decided to scale back the GTX 780's capacity for double-precision floating-point math. Double-precision support is built into the GK110 GPU because of the chip's compute-focused role aboard Nvidia's Tesla products. Real-time graphics basically don't require that level of precision. The Titan offers the GK110's full DP performance, so it can be used for scientific computing and other non-graphics compute applications. On the GTX 780, DP math executes at 1/24th the rate of single-precision math, just enough to maintain compatibility without truly being useful.
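To put that 1/24th figure in perspective, here's a back-of-the-envelope sketch. It assumes each shader ALU can retire one fused multiply-add (two flops) per clock and that the card runs at its 900MHz Boost clock; both are our assumptions for a peak estimate, and the names are illustrative.

```python
from fractions import Fraction

SHADER_ALUS = 2304           # GTX 780 shader ALU count
BOOST_CLOCK_MHZ = 900        # GTX 780 Boost clock
FLOPS_PER_ALU_PER_CLOCK = 2  # assumption: one fused multiply-add per clock
DP_RATIO = Fraction(1, 24)   # DP rate relative to SP, per the text


def peak_gflops(alus, clock_mhz, ratio=Fraction(1)):
    """Peak arithmetic rate in gigaflops, scaled by an optional ratio."""
    return float(alus * FLOPS_PER_ALU_PER_CLOCK * clock_mhz * ratio / 1000)


sp_gflops = peak_gflops(SHADER_ALUS, BOOST_CLOCK_MHZ)
dp_gflops = peak_gflops(SHADER_ALUS, BOOST_CLOCK_MHZ, DP_RATIO)
print(f"SP: {sp_gflops:.1f} gflops, DP: {dp_gflops:.1f} gflops")
```

Under those assumptions, roughly 4.1 teraflops of single-precision math shrinks to about 173 gigaflops of double-precision.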
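|Card||Base clock (MHz)||Boost clock (MHz)||Shader ALUs||Textures filtered/clock||ROP pixels/clock||Memory transfer rate||Memory interface width (bits)||Peak power draw|
|GeForce GTX 580||772||-||512||64||48||4 GT/s||384||244W|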
|GeForce GTX 680||1006||1058||1536||128||32||6 GT/s||256||195W|
|GeForce GTX 780||863||900||2304||192||48||6 GT/s||384||250W|
|GeForce GTX Titan||836||876||2688||224||48||6 GT/s||384||250W|
|GeForce GTX 690||915||1019||3072||256||64||6 GT/s||2 x 256||300W|
Outside of the GPCs, the GK110 chip on the GTX 780 isn't hobbled at all. All six of its memory controllers and ROP partitions are active, as is its full 1536KB of L2 cache. The GTX 780 has a 384-bit aggregate path to memory and 48 pixels per clock of ROP throughput, just like the Titan. Even the 6Gbps memory transfer rate is the same, although the GTX 780 has 3GB of GDDR5 memory, not the outsized 6GB memory capacity of the Titan.
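The bandwidth math behind those numbers is straightforward: bytes moved per transfer times transfers per second. A minimal sketch, with a function name of our own choosing:

```python
def memory_bandwidth_gbs(bus_width_bits: int, transfer_rate_gtps: float) -> float:
    """Peak memory bandwidth in GB/s.

    bus_width_bits / 8 gives bytes per transfer; multiplying by the
    transfer rate in GT/s gives gigabytes per second.
    """
    return bus_width_bits / 8 * transfer_rate_gtps


gtx_780_bw = memory_bandwidth_gbs(384, 6.0)
gtx_680_bw = memory_bandwidth_gbs(256, 6.0)
print(f"GTX 780: {gtx_780_bw:.0f} GB/s")
print(f"GTX 680: {gtx_680_bw:.0f} GB/s")
```

The 384-bit path at 6 GT/s works out to 288 GB/s, versus 192 GB/s for the GTX 680's 256-bit interface at the same transfer rate.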
In fact, the 780's clock frequencies are a little more aggressive than the Titan's, with an 863MHz base and a 900MHz Boost clock. (Nvidia says the Boost clock should be the typical operating speed while gaming.) By contrast, the Titan's base and boost clocks are 836 and 876MHz, respectively.
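|Card||Peak pixel fill rate (Gpixels/s)||Peak bilinear filtering int8 (Gtexels/s)||Peak bilinear filtering fp16 (Gtexels/s)||Peak shader arithmetic rate (tflops)||Peak rasterization rate (Gtris/s)||Memory bandwidth (GB/s)|
|GeForce GTX 580||37||49||49||1.6||3.1||192|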
|GeForce GTX 680||34||135||135||3.3||4.2||192|
|GeForce GTX 780||43||173||173||4.2||4.5||288|
|GeForce GTX Titan||42||196||196||4.7||4.4||288|
|GeForce GTX 690||65||261||261||6.5||8.2||385|
|Radeon HD 7970 GHz||34||134||67||4.3||2.1||288|
|Radeon HD 7990||64||256||128||8.2||4.0||576|
The result of all of this fine-grained tuning is evident in the table above. The GTX 780 has lower peak texture filtering and shader arithmetic rates than the Titan, but its ROP and rasterization rates are potentially higher than the Titan's by just a smidgen. The two cards' memory bandwidth specs are equivalent.
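If you want to see where the 780 and Titan trade places, the peak rates can be approximated as per-clock unit counts multiplied by the Boost clock. The sketch below is our own rough model, not Nvidia's math: it assumes two flops per ALU per clock, one triangle per clock per raster engine, and all five raster engines active on both cards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    boost_mhz: int
    shader_alus: int
    textures_per_clock: int
    rop_pixels_per_clock: int
    raster_engines: int  # assumption: one triangle per clock each

    def pixel_fill_gpixels(self) -> float:
        return self.rop_pixels_per_clock * self.boost_mhz / 1000

    def texture_rate_gtexels(self) -> float:
        return self.textures_per_clock * self.boost_mhz / 1000

    def shader_tflops(self) -> float:
        # Assumption: 2 flops per ALU per clock (fused multiply-add).
        return self.shader_alus * 2 * self.boost_mhz / 1_000_000

    def raster_gtris(self) -> float:
        return self.raster_engines * self.boost_mhz / 1000


gtx_780 = Card("GeForce GTX 780", 900, 2304, 192, 48, raster_engines=5)
titan = Card("GeForce GTX Titan", 876, 2688, 224, 48, raster_engines=5)

for card in (gtx_780, titan):
    print(
        f"{card.name}: {card.pixel_fill_gpixels():.1f} Gpixels/s, "
        f"{card.texture_rate_gtexels():.1f} Gtexels/s, "
        f"{card.shader_tflops():.2f} tflops, "
        f"{card.raster_gtris():.2f} Gtris/s"
    )
```

These estimates won't necessarily match the table to the last digit, but the ordering holds: the 780's higher clock gives it the edge in pixel fill and rasterization, while the Titan's extra SMX units win on texturing and shader math. Construct the 780 with `raster_engines=4` and its rasterization rate falls to 3.6 Gtris/s, below the Titan's, which is why that advantage is only a potential one.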
The 780's status as a not-quite-Titan puts it a notch or two above the GeForce GTX 680 in almost every key rate. This new card has AMD's fastest single-GPU card, the Radeon HD 7970 GHz Edition, outgunned in every category except two: memory bandwidth and shader arithmetic, where the two are neck and neck. If you step back another generation and compare to the GeForce GTX 580, the contrasts are much starker. The GTX 780 has about three and a half times the texture filtering capacity of the GTX 580 and offers smaller-but-still-noteworthy gains in every other category.
The not-quite-Titan theme carries over into the physical appearance of the GTX 780. The two cards are practically identical, save for the extra 3GB worth of memory chips on the back of the Titan and the different names etched into their aluminum-and-magnesium cooling shrouds. That's a good thing, since we are big, er, fans of the Titan's cooler. Not only does it perform well, but the premium materials also lend it a touch of class that the usual shiny plastic shrouds can't match.
Like the Titan, the GTX 780 is 10.5" long—same length as a Radeon HD 7970—and requires 6-pin and 8-pin aux power inputs. The output complement is the same, as well, with two dual-link DVI ports, an HDMI output, and a full-sized DisplayPort 1.2 connector.
One new wrinkle Nvidia has added to the GTX 780 is a revised fan speed control algorithm. This new routine attempts to limit the amount of fluctuation in fan speeds over time. That should reduce the number of pitch changes coming from the card's blower, making the noise it produces less noticeable.
Now for the scandalous bit. The GeForce GTX 780 should be available at online retailers starting today for $649.99. That's 350 bones less than the Titan, for a card that's just a slightly de-tuned variant with 3GB of memory. If you just recently paid a grand for a Titan, well, my condolences. From here on out, I suspect the Titan's appeal will be very much limited by the appearance of the GTX 780.