Pascal is here. After a long, long stop at the 28-nm process node, Nvidia is storming into a new era of graphics performance with a freshly-minted graphics architecture manifested on TSMC's 16-nm FinFET process. Forget the modest introduction of Maxwell on board the GeForce GTX 750 Ti. The first consumer Pascal graphics card—the GeForce GTX 1080—is a high-performance monster that's so fast, it's practically warping the high-end graphics card market in its wake.
We can say all of this now because we've seen the numbers other reviewers have generated over the past few weeks, just as you have. Our goal today isn't to produce a bunch of average frame rate results and crown the card a winner, though—we already know that the GTX 1080 is the fastest single-GPU graphics card around by that measure. Instead, we'll be using our frame-time benchmarking methods to characterize just how smooth a gaming experience the GTX 1080 delivers along with its world-beating speed.
First, though, let's discuss the improvements that Nvidia made under the hood with Pascal to deliver the kinds of performance we'll be seeing from our test bench. You should check out our Pascal architecture deep-dive to get a broad idea of where Nvidia is coming from with its latest generation of products if you haven't already—we won't be revisiting all of the information presented there in this review.
A new GPU: GP104
Nvidia isn't using its largest Pascal chip to power the GTX 1080. That GP100 GPU is only available as part of a Tesla P100 card for high-performance computing systems right now, and for good reason. That enormous chip (610 mm²!) is full of double-precision hardware that's not of use to most gamers, and Nvidia is apparently having no problem selling every one of those chips it can make as parts of systems for large businesses with a need for double-precision speed.
The GTX 1080 and its GTX 1070 sibling are both powered by a smaller chip called GP104. This 314-mm² chip has a smaller surface area than GM204 before it, but thanks to the wonders of Moore's Law, we get more power in that tinier space. The fully-enabled GP104 chip in the GTX 1080 has 20 Pascal SMs for a total of 2560 stream processors, up 25% from GM204's 2048, and about 17% fewer than the 3072 in the fully-enabled GM200 on the Titan X.
Nvidia has also bumped GP104's texturing capabilities a bit. This chip has 160 texture units, up from 128 in GM204. Its complement of 64 ROPs is the same as the middleweight Maxwell's, though, and that ROP count is still down on the 96 of the GM200 chip in the Titan X and the GTX 980 Ti.
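| GPU | ROP pixels/clock | Texels filtered/clock (int/fp16) | Shader processors | Rasterized triangles/clock | Memory interface width (bits) | Estimated transistor count (millions) | Die size (mm²) | Fab process |
|---|---|---|---|---|---|---|---|---|
| GM204 | 64 | 128/128 | 2048 | 4 | 256 | 5200 | 416 (398) | 28 nm |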
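For readers who want to check the unit-count comparisons themselves, here is a minimal Python sketch. It uses only the stream processor, SM, and texture unit counts quoted in the text; the `pct_change` helper is our own illustrative name, not anything from Nvidia.

```python
# Unit counts quoted in the text above.
SMS_GP104 = 20
SP_GP104 = 2560   # stream processors, GTX 1080
SP_GM204 = 2048   # stream processors, GTX 980
SP_GM200 = 3072   # stream processors, Titan X
TEX_GP104 = 160   # texture units, GP104
TEX_GM204 = 128   # texture units, GM204


def pct_change(new, old):
    """Percentage change going from `old` to `new`."""
    return (new - old) / old * 100.0


print(f"Stream processors per SM: {SP_GP104 // SMS_GP104}")
print(f"GP104 vs. GM204 stream processors: {pct_change(SP_GP104, SP_GM204):+.1f}%")
print(f"GP104 vs. GM200 stream processors: {pct_change(SP_GP104, SP_GM200):+.1f}%")
print(f"GP104 vs. GM204 texture units:     {pct_change(TEX_GP104, TEX_GM204):+.1f}%")
```

The same helper works for any other pair of resource counts you want to compare.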
What's most eye-popping about GP104 isn't its resource allocations, impressive though they might be. It's the chip's clock speeds. The reference GTX 1080 runs at bonkers 1607MHz base and 1733MHz boost speeds. Recall that the GM204 chip in the GTX 980 ran at 1126MHz base and 1216MHz boost clocks in its reference design. Nvidia has also demonstrated considerable overclocking headroom on GP104. The company showed off a card running at 2.1GHz—on air, no less—during its Dreamhack keynote.
That clock jump is partially thanks to the move to the 16-nm FinFET process, but Nvidia says its engineers worked hard on boosting clock speeds in the chip's design process, too. The company says the finished product's clock speed boost is "well above" what the process shrink alone would have produced.
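To put those clocks in perspective, the short Python sketch below works out the clock-speed ratios and a theoretical peak single-precision rate for each card. It assumes the usual convention of two floating-point operations per stream processor per clock (one fused multiply-add). That convention and the `peak_fp32_tflops` name are our assumptions rather than figures from this review, and the results are paper numbers, not measured performance.

```python
def peak_fp32_tflops(stream_processors, clock_mhz):
    """Theoretical peak FP32 rate, assuming 2 FLOPs (one FMA) per SP per clock."""
    return 2 * stream_processors * clock_mhz * 1e6 / 1e12


# Reference clocks (MHz) and stream processor counts quoted in the text.
gtx_1080 = {"sps": 2560, "base": 1607, "boost": 1733}
gtx_980 = {"sps": 2048, "base": 1126, "boost": 1216}

for kind in ("base", "boost"):
    ratio = gtx_1080[kind] / gtx_980[kind]
    print(f"{kind} clock ratio, GTX 1080 vs. GTX 980: {ratio:.2f}x")

for name, card in (("GTX 1080", gtx_1080), ("GTX 980", gtx_980)):
    tflops = peak_fp32_tflops(card["sps"], card["boost"])
    print(f"{name} peak FP32 at boost clock: {tflops:.2f} TFLOPS")
```

At reference boost clocks, this works out to roughly 8.9 TFLOPS for the GTX 1080 against about 5.0 TFLOPS for the GTX 980.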
In general, a move to a smaller process gives chip designers the ability to extract the same performance from a device that consumes less power, or to get more performance from the same power budget. Given the choice, it's not surprising that Nvidia's engineers appear to be pushing the performance envelope this time around. The GTX 1080's 180W board power has crept up a bit from the GTX 980's 165W figure, but it's still frugal enough that the green team only needed to put a single eight-pin PCIe power connector on the card. We've long praised the company's Maxwell cards for their efficiency, so we'll forgive the GTX 1080 its slightly higher power requirements on paper.
New memory, too: GDDR5X
While the Tesla P100 is packaged with 16GB of HBM2 RAM, Nvidia uses GDDR5X RAM on the GTX 1080. GDDR5X is an evolution of the GDDR5 standard we know and love. GDDR5X achieves higher transfer rates per pin (10 to 14 GT/s) than GDDR5. Nvidia runs these chips at 10 GT/s and pairs them with a 256-bit memory bus. That's good for a theoretical 320 GB/s of bandwidth. At first glance, that looks like a major improvement over the GTX 980's 224 GB/s, but it's a bit short of the GeForce GTX 980 Ti's 336 GB/s and well behind the Radeon R9 Fury X's 512 GB/s.
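The bandwidth figures above all fall out of the same simple formula: per-pin transfer rate times bus width, divided by eight bits per byte. The Python sketch below runs that arithmetic. The GTX 1080's 10 GT/s and 256-bit figures come from the text; the per-pin rates and bus widths for the other three cards are their published specs as we recall them, not numbers from this article, so treat them as assumptions.

```python
def bandwidth_gb_s(transfer_rate_gt_s, bus_width_bits):
    """Theoretical peak memory bandwidth in GB/s."""
    return transfer_rate_gt_s * bus_width_bits / 8


# (per-pin rate in GT/s, bus width in bits)
cards = {
    "GeForce GTX 1080 (GDDR5X)": (10, 256),   # from the text
    "GeForce GTX 980 (GDDR5)": (7, 256),      # assumed published spec
    "GeForce GTX 980 Ti (GDDR5)": (7, 384),   # assumed published spec
    "Radeon R9 Fury X (HBM)": (1, 4096),      # assumed published spec
}

for name, (rate, width) in cards.items():
    print(f"{name}: {bandwidth_gb_s(rate, width):.0f} GB/s")
```

Those inputs reproduce the 320, 224, 336, and 512 GB/s figures quoted above.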
Raw transfer rates don't tell the whole story in Pascal, though. This new architecture has a souped-up version of the delta-color-compression techniques that we've seen adopted across the industry. Pascal can apply its 2:1 compression more often, and it includes two new compression modes. Nvidia says the chip can employ a new 4:1 compression mode in cases where per-pixel deltas are "very small," and an 8:1 compression mode "combines 4:1 constant color compression of 2x2 pixel blocks with 2:1 compression of the deltas between those blocks."
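Nvidia hasn't published the exact rules its hardware uses to pick a compression mode, so the Python sketch below is only a toy illustration of the general idea: store one anchor value, look at how large the deltas are, and choose the most aggressive mode they fit in. The 4x4 single-channel tile, the bit-width thresholds, and the `fits` and `pick_mode` names are all hypothetical choices of ours, not details of Pascal.

```python
def fits(deltas, bits):
    """True if every delta fits in a signed integer `bits` wide."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return all(lo <= d <= hi for d in deltas)


def pick_mode(tile):
    """Pick a toy compression mode for a 4x4 tile of 8-bit values (one channel).

    Thresholds are invented for illustration; real hardware rules differ.
    """
    # Split the tile into its four 2x2 blocks.
    blocks = [
        [tile[r + dr][c + dc] for dr in (0, 1) for dc in (0, 1)]
        for r in (0, 2)
        for c in (0, 2)
    ]
    anchor = tile[0][0]
    pixel_deltas = [value - anchor for row in tile for value in row]

    # 8:1: every 2x2 block is a constant color, and the deltas
    # between those blocks are small.
    if all(len(set(block)) == 1 for block in blocks):
        block_deltas = [block[0] - blocks[0][0] for block in blocks]
        if fits(block_deltas, 4):
            return "8:1"
    # 4:1: per-pixel deltas are very small.
    if fits(pixel_deltas, 2):
        return "4:1"
    # 2:1: per-pixel deltas are moderately small.
    if fits(pixel_deltas, 4):
        return "2:1"
    return "uncompressed"


flat = [[10] * 4 for _ in range(4)]
dither = [[10 + (r + c) % 2 for c in range(4)] for r in range(4)]
gradient = [[10 + 2 * c + r % 2 for c in range(4)] for r in range(4)]
noise = [[0, 255, 0, 255] for _ in range(4)]

for name, tile in (("flat", flat), ("dither", dither),
                   ("gradient", gradient), ("noise", noise)):
    print(f"{name}: {pick_mode(tile)}")
```

The point of the toy is the ordering: the flatter the tile, the higher the ratio it qualifies for, while noisy content falls through to no compression at all.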
The net result of that compression cleverness is that Pascal can squeeze down more of the color information in a frame than Maxwell GPUs could. That lets the card hold more data in its caches, reduce the number of trips out to its onboard memory, and reduce the size of data transferred across the chip. Nvidia says these improvements are good for a roughly 20% increase in "effective bandwidth" above and beyond the move to GDDR5X alone.
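As a rough back-of-the-envelope check, the Python sketch below stacks Nvidia's claimed 20% compression gain on top of the raw bandwidth numbers quoted earlier. Treating "effective bandwidth" as a simple multiplier on raw bandwidth is our simplifying assumption; real gains will vary with the content being rendered.

```python
RAW_GTX_1080_GB_S = 320.0   # theoretical GDDR5X bandwidth, from the text
RAW_GTX_980_GB_S = 224.0    # GTX 980 bandwidth, from the text
COMPRESSION_GAIN = 1.20     # Nvidia's claimed ~20% effective-bandwidth boost

# Assumption: model the compression gain as a flat multiplier.
effective_gtx_1080 = RAW_GTX_1080_GB_S * COMPRESSION_GAIN
raw_ratio = RAW_GTX_1080_GB_S / RAW_GTX_980_GB_S
effective_ratio = effective_gtx_1080 / RAW_GTX_980_GB_S

print(f"GTX 1080 'effective' bandwidth: {effective_gtx_1080:.0f} GB/s")
print(f"Raw gain over GTX 980:          {raw_ratio:.2f}x")
print(f"Effective gain over GTX 980:    {effective_ratio:.2f}x")
```

By this crude model, the GTX 1080 behaves as though it had about 384 GB/s to work with, or roughly 1.7 times the GTX 980's raw figure.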