Pascal is here. After a long, long stop at the 28-nm process node, Nvidia is storming into a new era of graphics performance with a freshly-minted graphics architecture manifested on TSMC’s 16-nm FinFET process. Forget the modest introduction of Maxwell on board the GeForce GTX 750 Ti. The first consumer Pascal graphics card—the GeForce GTX 1080—is a high-performance monster that’s so fast, it’s practically warping the high-end graphics card market in its wake.
We can say all of this now because we’ve seen the numbers other reviewers have generated over the past few weeks, just as you have. Our goal today isn’t to produce a bunch of average frame rate results and crown the card a winner, though—we already know that the GTX 1080 is the fastest single-GPU graphics card around by that measure. Instead, we’ll be using our frame-time benchmarking methods to characterize just how smooth a gaming experience the GTX 1080 delivers along with its world-beating speed.
First, though, let’s discuss the improvements that Nvidia made under the hood with Pascal to deliver the kinds of performance we’ll be seeing from our test bench. You should check out our Pascal architecture deep-dive to get a broad idea of where Nvidia is coming from with its latest generation of products if you haven’t already—we won’t be revisiting all of the information presented there in this review.
A new GPU: GP104
Nvidia isn’t using its largest Pascal chip to power the GTX 1080. That GP100 GPU is only available as part of a Tesla P100 card for high-performance computing systems right now, and for good reason. That enormous chip (610 mm2!) is full of double-precision hardware that’s not of use to most gamers, and Nvidia is apparently having no problem selling every one of those chips it can make as parts of systems for large businesses with a need for double-precision speed.
The GTX 1080 and its GTX 1070 sibling are both powered by a smaller chip called GP104. This 314-mm2 chip has a smaller surface area than GM204 before it, but thanks to the wonders of Moore’s Law, we get more power in that tinier space. The fully-enabled GP104 chip in the GTX 1080 has 20 Pascal SMs for a total of 2560 stream processors, up 25% from GM204’s 2048, and about 17% fewer than the 3072 in the fully-enabled GM200 on the Titan X.
A block diagram of the GP104 GPU. Source: Nvidia
Nvidia has also bumped GP104’s texturing capabilities a bit. This chip has 160 texture units, up from 128 in GM204. Its complement of 64 ROPs is the same as the middleweight Maxwell’s, though, and that ROP count is still down on the 96 of the GM200 chip in the Titan X and the GTX 980 Ti.
| | ROP pixels/clock | Texels filtered/clock (int/bi) | Stream processors | Rasterized triangles/clock | Memory interface width (bits) | Estimated transistors (millions) | Die size (mm²) | Fab process |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GM204 | 64 | 128/128 | 2048 | 4 | 256 | 5200 | 416 (398) | 28 nm |
| GP104 | 64 | 160/160 | 2560 | 4 | 256 | 7200 | 314 | 16 nm |
What’s most eye-popping about GP104 isn’t its resource allocations, impressive though they might be. It’s the chip’s clock speeds. The reference GTX 1080 runs at a bonkers 1607MHz base and 1733MHz boost speed. Recall that the GM204 chip in the GTX 980 ran at 1126MHz base and 1216MHz boost clocks in its reference design. Nvidia has also demonstrated considerable overclocking headroom on GP104. The company showed off a card running at 2.1GHz—on air, no less—during its Dreamhack keynote.
That clock jump is partially thanks to the move to the 16-nm FinFET process, but Nvidia says its engineers worked hard on boosting clock speeds in the chip’s design process, too. The company says the finished product’s clock speed boost is “well above” what the process shrink alone would have produced.
In general, a move to a smaller process gives chip designers the ability to extract the same performance from a device that consumes less power, or to get more performance from the same power budget. Given the choice, it’s not surprising that Nvidia’s engineers appear to be pushing the performance envelope this time around. The GTX 1080’s 180W board power has crept up a bit from the GTX 980’s 165W figure, but it’s still frugal enough that the green team only needed to put a single eight-pin PCIe power connector on the card. We’ve long praised the company’s Maxwell cards for their efficiency, so we’ll forgive the GTX 1080 its slightly higher power requirements on paper.
New memory, too: GDDR5X
While the Tesla P100 is packaged with 16GB of HBM2 RAM, Nvidia uses GDDR5X RAM on the GTX 1080. GDDR5X is an evolution of the GDDR5 standard we know and love, and it achieves higher transfer rates per pin (10 to 14 GT/s) than GDDR5. Nvidia runs these chips at 10 GT/s and pairs them with a 256-bit memory bus. That’s good for a theoretical 320 GB/s of bandwidth: a major improvement over the GTX 980’s 224 GB/s rate, but a bit short of the GeForce GTX 980 Ti’s 336 GB/s and well behind the Radeon R9 Fury X’s 512 GB/s.
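For the curious, that peak-bandwidth figure falls straight out of the per-pin rate and the bus width. Here’s a quick sketch; the helper function is our own, not part of any real API:

```python
# Peak memory bandwidth: per-pin transfer rate (GT/s) times bus width in bytes.
# (Illustrative helper; the function name is our own invention.)
def peak_bandwidth_gbs(rate_gt_s, bus_width_bits):
    return rate_gt_s * bus_width_bits / 8

print(peak_bandwidth_gbs(10, 256))  # GTX 1080: 320.0 GB/s
print(peak_bandwidth_gbs(7, 256))   # GTX 980:  224.0 GB/s
```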
Raw transfer rates don’t tell the whole story in Pascal, though. This new architecture has a souped-up version of the delta-color-compression techniques that we’ve seen adopted across the industry. Pascal can apply its 2:1 compression more often, and it includes two new compression modes. Nvidia says the chip can employ a new 4:1 compression mode in cases where per-pixel deltas are “very small,” and an 8:1 compression mode “combines 4:1 constant color compression of 2×2 pixel blocks with 2:1 compression of the deltas between those blocks.”
An example of Pascal’s color compression in action. Pink portions of the frame are compressed. Source: Nvidia
The net result of that compression cleverness is that Pascal can squeeze down more of the color information in a frame than Maxwell GPUs could. That lets the card hold more data in its caches, reduce the number of trips out to its onboard memory, and reduce the size of data transferred across the chip. Nvidia says these improvements are good for a roughly 20% increase in “effective bandwidth” above and beyond the move to GDDR5X alone.
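To make the idea concrete, here’s a toy model of delta compression. This is our own simplification, not Nvidia’s actual algorithm: store one anchor pixel plus small per-pixel deltas, and fall back to raw storage whenever any delta is too large to encode in the narrow field.

```python
# Toy sketch of 2:1 delta color compression (not Nvidia's real scheme):
# a block compresses if every pixel can be stored as a small signed
# delta from an anchor pixel, roughly halving the block's footprint.
def compresses_2to1(block, delta_bits=4):
    anchor = block[0]
    limit = 1 << (delta_bits - 1)          # signed delta range, e.g. -8..7
    return all(-limit <= p - anchor < limit for p in block)

smooth_sky = [200, 201, 203, 202]          # tiny deltas: compressible
noisy_tex  = [200, 90, 255, 12]            # large deltas: stored raw

print(compresses_2to1(smooth_sky))  # True
print(compresses_2to1(noisy_tex))   # False
```

Smooth gradients like skies compress well; noisy textures don’t, which is why the pink regions in Nvidia’s example frames cluster in low-detail areas.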
Pascal architectural improvements
Anybody attuned to the enthusiast hardware scene over the past few months has doubtless heard a ton about graphics cards’ asynchronous compute capabilities, namely Radeons’ prowess and GeForces’ apparent shortcomings on that point. However much stock you place in this argument, Pascal appears to offer improved asynchronous compute capability versus Maxwell chips.
First, we should talk a little bit about the characteristics of an asynchronous compute workload. Nvidia suggests that an asynchronous task might overlap with another task running on the GPU at the same time, or it might need to interrupt a task that’s running in order to complete within a given time window.
One example of such a compute task is asynchronous timewarp, a VR rendering method that uses head-position data to slightly reproject a frame before sending it out to the VR headset. Nvidia notes that timewarp often needs to interrupt—or preempt—a task in progress to execute on time. On the other hand, less time-critical workloads, like physics or audio calculations, might run concurrently (but asynchronously) with rendering tasks. Nvidia says Pascal chips support two major forms of asynchronous compute execution: dynamic load-balancing for overlapping workloads, and pixel-level preemption for time-sensitive ones.
It’s here that we actually learn a thing or two about what Maxwell could do in this regard—perhaps even in more depth than we ever did while those chips were the hottest thing on the market. Nvidia says Maxwell provided overlapping workloads with a static partitioning of resources: one partition for graphics tasks, and another for compute. The company says this approach was effective when the partitioning scheme matched the resources needed by both graphics and compute workloads. Maxwell’s static partitioning has a downside, though: mess up that initial resource allocation, and a graphics task can complete before a compute task, causing part of the GPU to go idle while it waits for the compute task to complete and for new work to be dispatched.
It might seem obvious to say so, but like any modern chip, GPUs want all of their pipelines filled as much of the time as possible in order to extract maximum performance. Idle resources are bad news. Nvidia admits as much in its documentation, noting that a long-running task in one resource partition could erase whatever benefit running the tasks concurrently might have offered. Either way, if you were wondering what exactly was going on with Maxwell and async compute way back when, it appears this is your answer.
Pascal looks like it’s much better provisioned to handle asynchronous workloads. For overlapping tasks, the chip can now perform what Nvidia calls dynamic load balancing. Unlike the rather coarse-sounding partitioning method outlined above, Pascal chips can dispatch work to idle parts of the GPU on the fly, potentially keeping more of the chip at work and improving performance.
Nvidia doesn’t go into the same depth about Maxwell’s pre-emption capability as it does for the architecture’s methods for handling overlapping workloads, but given friend-of-TR David Kanter’s now-infamous comment about preemption on Maxwell being “potentially catastrophic,” perhaps we can guess why. Pascal’s preemption abilities seem to be much better, though. Let’s talk about them.
For one, Nvidia claims Pascal is the first GPU architecture to implement preemption at the pixel level. The company says each of the chip’s graphics units can keep track of its intermediate state on a work unit. That fine-grained awareness lets those resources quickly save state, service the preemption request, and pick up work where they left off once the high-priority task is complete. Once the GPU is finished with the work that it can’t save and unload, Nvidia says that task-switching with preemption can finish in under 100 microseconds. Compute tasks also benefit from the finer-grained preemption capabilities of Pascal cards. If a CUDA workload needs to preempt another running compute task, that interruption can occur at the instruction level.
Simultaneous multi-projection, single-pass stereo, and VR
One of the biggest architectural changes in Pascal is a new component in the Polymorph Engine geometry processor that arrived in Fermi GPUs. That processor now benefits from a feature called the Simultaneous Multi-Projection Engine, or SMPE. This hardware can take geometry information from the upstream graphics pipeline and create up to 16 separate pre-configured projections of a scene across up to two different camera positions. This hardware efficiently performs a task that would have previously required generating geometry for as many separate projections as a developer wanted to create—a prohibitively performance-intensive task.
All that jargon essentially means that in situations where a single projection might have caused weird-looking perspective errors, like one might see with a three-monitor surround setup, Pascal can now account for the angle of those displays (with help from the application programmer) and create the illusion of a continuous space across all three monitors with no perspective problems.
Surround gaming is just one application for this technology, though—it also has major implications for VR performance. You’ll remember that the SMPE can create projections based on up to two different camera positions. Humans have two eyes, and if we put on a VR headset, we end up looking at two different screens with slightly different views of a scene. Before Pascal hit the market, Nvidia says graphics cards had to render for each eye’s viewpoint separately, resulting in twice as much work.
An example of how the same scene needs to look for different eyes in VR. Source: Nvidia
With Pascal, however, SMPE enables a new capability called Single-Pass Stereo rendering for VR headsets. As Nvidia puts it, Single-Pass Stereo lets an application submit its vertex work just once. The graphics card will then produce two positions for each vertex and match up each one with the correct eye. This resource essentially cuts the work necessary to render for a VR headset in half, presuming a developer takes advantage of it.
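The vertex-doubling idea can be sketched in a few lines. This is a loose conceptual illustration, not Nvidia’s API; the eye-offset value is a made-up stand-in for a typical interpupillary distance, and real implementations do this in clip space via the SMP hardware:

```python
# Conceptual sketch of single-pass stereo: each vertex is submitted once
# and yields two x-shifted positions, one per eye. (Our own illustration;
# the ipd default is an assumed eye separation in meters.)
def stereo_positions(vertex, ipd=0.064):
    x, y, z = vertex
    half = ipd / 2
    return (x + half, y, z), (x - half, y, z)  # (left eye, right eye)

left, right = stereo_positions((1.0, 2.0, -5.0))
print(left, right)
```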
An example VR scene, before and after traditional post-processing for a VR headset display. Source: Nvidia
SMPE and its effects on VR don’t end there, however. The technology also allows developers to take advantage of a feature called Lens Matched Shading, or LMS for short. Prior to Pascal, graphics cards had to render the first pass of an image for a VR viewport assuming a flat projection. Because VR headsets rely on distorting lenses to create a natural-looking result, however, a pre-distorted image then has to be produced from the flat initial rendering to create a final scene that looks correct through the headset. This step throws away data. Nvidia says that a traditional graphics card might start with a 2.1MP image for a VR scene, but after post-processing, that image might be only 1.1MP. That’s a huge amount of extra work for pixels that are just going to be discarded.
An example of Lens Matched Shading in action. Source: Nvidia
LMS, on the other hand, takes advantage of the SMPE to render a scene more efficiently. It first slices the viewport into quadrants and then uses each of those to generate an associated projection that’s close to that of the part of the lens that will eventually be used to view the image. With this multi-projection rendering, the preliminary image in Nvidia’s example is just 1.4MP before it goes through the final post-processing step—a major increase in efficiency.
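Running Nvidia’s example numbers makes the savings plain. The percentages below are simple arithmetic on the figures quoted above, nothing more:

```python
# Pixel-count bookkeeping from Nvidia's LMS example (all figures in megapixels).
flat_rendered = 2.1   # flat projection rendered before lens distortion
kept          = 1.1   # pixels that survive the distortion step either way
lms_rendered  = 1.4   # what Lens Matched Shading renders instead

wasted_flat = flat_rendered - kept              # ~1.0MP shaded, then discarded
wasted_lms  = lms_rendered - kept               # ~0.3MP discarded
savings     = 1 - lms_rendered / flat_rendered  # ~33% less shading work
print(round(wasted_flat, 1), round(wasted_lms, 1), round(savings, 2))
```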
A grab bag of other improvements and changes
The GTX 1080 and the Pascal architecture introduce a number of smaller improvements and changes, as well. We won’t be covering these in depth today, but some of them are worth taking a brief look at. For more information, we’d recommend checking out Nvidia’s excellent GeForce GTX 1080 whitepaper.
Nvidia notes that competitive gamers who run titles like Counter-Strike: Global Offensive at high frame rates often leave v-sync off to let the graphics card run as fast as possible and to minimize input latency, at the expense of introducing tearing. When you’re rendering frames at multiple hundreds of FPS, it makes sense that tearing would be rampant. Fast Sync is a new frame output method that’s meant to eliminate tearing while maintaining most of the competitive benefits that running with vsync off entails. To accomplish that ideal, Nvidia says it decoupled the rendering and display stages of the graphics pipeline. With Fast Sync on, the card can still render frames as fast as possible, but it’ll only send completed frames to the display, avoiding tearing.
To make that principle work, the Fast Sync logic adds a third buffer—the “last rendered buffer”—to the traditional front- and back-buffers of a graphics pipeline with vsync on. This new buffer contains the last complete frame written to the back buffer. It holds this frame until the front buffer finishes sending a frame to the display, at which point it’s renamed to the front buffer and the display begins writing out the completed frame held within.
Nvidia emphasizes that no copying between buffers occurs in this process. Rather, the company notes that it’s much more efficient to simply rename buffers on the fly. Once the last rendered buffer becomes the front buffer and scanout begins, a bunch of buffer-naming musical chairs occurs in the background while the display is scanning out so that rendered frames have places to go in the meantime. When the display scanout completes and the music stops, whichever buffer had assumed the role of the “last rendered buffer” prior to that point becomes the front buffer, and the cycle repeats. Nvidia says new flip logic in Pascal is responsible for managing this process.
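Here’s a rough simulation of that renaming scheme. It’s our own sketch of the behavior described above, not Nvidia’s flip logic; the point is simply that buffers trade roles by name, and no frame data is ever copied:

```python
# Toy model of Fast Sync's triple buffering: three buffers swap roles
# by name instead of copying contents. (Our sketch, not Nvidia's logic.)
class FastSyncBuffers:
    def __init__(self):
        self.front, self.back, self.last_rendered = "A", "B", "C"

    def frame_rendered(self):
        # A frame just completed in the back buffer: it becomes the
        # last-rendered buffer, and the old one is recycled for rendering.
        self.back, self.last_rendered = self.last_rendered, self.back

    def scanout_begins(self):
        # The display always flips to the most recent complete frame.
        self.front, self.last_rendered = self.last_rendered, self.front

bufs = FastSyncBuffers()
bufs.frame_rendered()   # a frame finished in B
bufs.frame_rendered()   # a newer frame finished in C; B is recycled
bufs.scanout_begins()   # the display flips straight to the newest frame
print(bufs.front)       # "C"
```

Note that when the game outruns the display, intermediate frames (here, the one in B) are simply never shown, which is how Fast Sync avoids tearing without throttling the renderer.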
Fast Sync adds a bit of latency to the gameplay experience, as the annoyingly vague chart above purports to show. Still, as someone who’s exceptionally sensitive to tearing, I’d welcome slightly more input latency in trade for banishing that ugly visual artifact from my life.
To be clear, Fast Sync is not a replacement for G-Sync or FreeSync variable-refresh-rate monitors—it’s an interesting but separate complement to those technologies. We’ll need to play with this tech and see how it works in practice.
SLI and Pascal
We’ve already covered the changes to SLI that Nvidia is making with its Pascal cards. To recap, the company is discontinuing internal development of SLI profiles for three- and four-way SLI setups and putting its weight entirely behind two-way configurations instead. Extreme benchmarkers will still be able to get three- and four-way SLI profiles for use with apps like 3DMark, but for all intents and purposes, two-way SLI is the way of the future.
HB SLI bridges. Source: Nvidia
Running two-way SLI at its maximum potential with the GTX 1080 requires a new “high-bandwidth” SLI bridge that links both sets of SLI “fingers” present on GTX 1080s. With the proper bridge, the GTX 1080’s SLI link runs at 650MHz. Older “LED SLI bridges” will also run at this speed, but the ribbon-cable bridge included with many motherboards will only run at 400MHz with Pascal cards. Nvidia says the net result of this change is a doubling in SLI bandwidth compared to past implementations of the technology.
If Nvidia’s internal numbers are to be believed, HB SLI has tangible benefits for in-game smoothness. The company ran Middle-earth: Shadow of Mordor on an 11520×2160 display array to show off the feature, and the frame time plot of that benchmark suggests that the added bandwidth helps reduce worst-case latency spikes.
HDR content and higher-res display support
One of the major updates that AMD has been touting for its next-gen Radeons has been a bevy of features related to high-dynamic-range gaming and video playback, and Pascal appears just as ready for that next-generation content.
Nvidia’s Maxwell cards already came with support for 12-bit color, the BT.2020 wide color gamut, and the SMPTE 2084 electro-optical transfer function (EOTF). Pascal adds support for 60Hz 4K HEVC video decoding with 10- or 12-bit color, 60Hz 4K HEVC encoding with 10-bit color for recording or streaming HDR content, and DisplayPort 1.4’s metadata transport spec for HDR over that connection. Pascal cards will also be able to perform game-streaming in VR with a compatible device, like Nvidia’s Shield Android TV.
Nvidia is also beefing up high-res display support with Pascal and the GTX 1080. While the new card can still run only four active displays, it can now run monitors that max out at 8K (7680×4320) using two DisplayPort 1.3 cables. The GTX 1080 supports HDMI 2.0b with HDCP 2.2, and though it’s just DisplayPort 1.2-certified right now, it’s ready for the upcoming, higher-bandwidth DisplayPort 1.3 and DisplayPort 1.4 standards.
Now that we’ve covered some of the biggest changes in Pascal for consumers, let’s talk about the GTX 1080 Founders Edition card itself.
The GeForce GTX 1080 Founders Edition
When Nvidia first announced the GeForce GTX 1080, the company introduced a new concept called a “Founders Edition” card. At the time, this new name gave rise to speculation the card would have some kind of special sauce inside, but we now know that it’s really just a different designation for what we used to call “reference coolers.” The Founders Edition card also has a $699.99 suggested price tag, a $100 premium over the $599.99 suggested price for custom cards from Nvidia’s board partners.
| | Base clock | Boost clock | Stream processors | Memory | Power connectors | Board power | Suggested price |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GeForce GTX 1080 | 1607 MHz | 1733 MHz | 2560 | 8GB GDDR5X | 1x 8-pin | 180W | $699.99 |
The GeForce GTX 1080 Founders Edition has a pretty standard Nvidia reference board look, much like the GeForce GTX 980 Ti before it. Nvidia devised a fancy new polygonal design for the aluminum cooler shroud, but even with the new aesthetics, it could hide among other recent Nvidia reference boards without arousing too much suspicion.
The new reference cooler’s internals are pretty similar to what’s come before. The heatsink that the blower-style fan cools looks the same as those inside the company’s older, higher-end reference cards. Nvidia touts this “vapor chamber” heatsink as something special, and the card does appear to use one, but that same basic heatsink has been present in cards dating back to the GTX 780, at least.
The one major upgrade from past Nvidia reference designs is the inclusion of a backplate on the card, which is a nice touch. The backplate is a pretty standard plastic-coated metal job.
After removing the million tiny screws that hold it in place, the backplate comes off to reveal the PCB itself. Four Phillips-head screws hold down the heatsink on the GPU, which thankfully can be removed without disassembling the whole card. The cooler assembly itself comes off as a unit.
Back on the front of the board, a few hex-key screws later, we can pull off the acrylic shroud and the metal trim piece that cover the heatsink. With the four Phillips screws removed from the back, we can fully expose the GPU die.
After removing even more screws, we can remove the shroud and see the exciting bits of the 1080. This view shows the GP104 die and the GDDR5X RAM that rings it. We also get a look at the 5+1-phase power delivery subsystem of the card, plus the single eight-pin PCIe power connector the 1080 uses to get the extra juice it needs from the PSU.
We were able to successfully reassemble our Founders Edition card without too much trouble after dissecting it. While we’ll be moving on to testing next, we actually performed this disassembly after we concluded our benchmarking, so the results you’ll see in the following pages all come from a factory-fresh GTX 1080. Let’s see what it can do.
Our testing methods
As always, we did our best to deliver clean benchmarking results. Our test system was configured as follows:
| Component | Details |
| --- | --- |
| Motherboard | Asus X99 Deluxe |
| Memory size | 16GB (4 DIMMs) |
| Memory type | Corsair Vengeance LPX DDR4 SDRAM at 3200 MT/s |
| Chipset drivers | Intel Management Engine 18.104.22.1685, Intel Rapid Storage Technology V 22.214.171.1241 |
| Audio | Integrated X99/Realtek ALC1150 with Realtek 126.96.36.19925 drivers |
| Hard drive | Kingston HyperX 480GB SATA 6Gbps |
| Power supply | Fractal Design Integra 750W |
| OS | Windows 10 Pro |
| Graphics card | Driver revision | GPU base clock (MHz) | GPU boost clock (MHz) | Memory clock (MHz) | Memory size (MB) |
| --- | --- | --- | --- | --- | --- |
| Asus Strix Radeon R9 Fury | Radeon Software 16.6.1 | – | 1000 | 500 | 4096 |
| Radeon R9 Fury X | Radeon Software 16.6.1 | – | 1050 | 500 | 4096 |
| Gigabyte Windforce GeForce GTX 980 | GeForce 368.39 | 1228 | 1329 | 1753 | 4096 |
| MSI GeForce GTX 980 Ti Gaming 6G | GeForce 368.39 | 1140 | 1228 | 1753 | 6144 |
| GeForce GTX 1080 | GeForce 368.39 | 1607 | 1733 | 2500 | 8192 |
Our thanks to Intel, Corsair, Asus, Kingston, and Fractal Design for helping us to outfit our test rigs, and to Nvidia and AMD for providing the graphics cards for testing, as well.
For our “Inside the Second” benchmarking techniques, we use the Fraps software utility to collect frame-time information for each frame rendered during our benchmark runs. We sometimes use a more advanced tool called FCAT to capture exactly when frames arrive at the display, but our testing has shown that it’s not usually necessary to use this tool in order to generate good results for single-GPU setups. We filter our Fraps data using a three-frame moving average to account for the three-frame submission queue in Direct3D. If you see a frame-time spike in our results, it’s likely a delay that would affect when a frame reaches the display.
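The smoothing step is easy to reproduce. Here’s a minimal version of a three-frame moving average over raw frame times; this is our own illustration of the filtering idea, not TR’s actual tooling:

```python
# Smooth raw per-frame times with a trailing three-frame moving average,
# mirroring the filtering step described above. Early frames use whatever
# history is available.
def three_frame_average(frame_times_ms):
    out = []
    for i in range(len(frame_times_ms)):
        window = frame_times_ms[max(0, i - 2): i + 1]   # up to 3 frames
        out.append(sum(window) / len(window))
    return out

raw = [16.7, 16.7, 50.0, 16.7, 16.7]      # one big spike
print([round(t, 1) for t in three_frame_average(raw)])
```

A single 50-ms spike still shows up clearly after filtering, but transient queueing noise that Direct3D’s submission buffering would hide from the player gets averaged away.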
You’ll note that aside from the Radeon R9 Fury X and the GeForce GTX 1080, our test card stable is made up of non-reference designs with boosted clock speeds and beefy coolers. Many readers have called us out on this practice in the past for some reason, so we want to be upfront about it here. We bench non-reference cards because we feel they provide the best real-world representation of performance for the graphics card in question. They’re the type of cards we recommend in our System Guides, so we think they provide the most relatable performance numbers for our reader base.
To make things simple, when you see “GTX 980” or “GTX 980 Ti” in our results, just remember that we’re talking about custom cards, not reference designs. You can read more about the MSI GeForce GTX 980 Ti Gaming 6G in our roundup of those custom cards. We also reviewed the Gigabyte Windforce GeForce GTX 980 a while back, and the Asus Strix Radeon R9 Fury was central to our review of that GPU.
Each title we benched was run in its DirectX 11 mode. We understand that DirectX 12 performance is a major point of interest for many gamers right now, but the number of titles out there with stable DirectX 12 implementations is quite small. We had trouble getting Rise of the Tomb Raider to even launch in its DX12 mode, and other titles like Gears of War: Ultimate Edition still seem to suffer from audio and engine timing issues on the PC. DX12 also poses challenges for data collection that we’re still working on. For a good gaming experience today, our money is still on DX11.
Finally, you’ll note that in the titles we benched at 4K, the Radeon R9 Fury is absent. That’s because our card wouldn’t play nicely with the 4K display we use on our test bench for some reason. It’s unclear why this issue arose, but in the interest of time, we decided to drop the card from our results. Going by our original Fury review, the GTX 980 is a decent proxy for the Fury’s performance, which is to say that it’s not usually up to the task of 4K gaming to begin with. You can peruse those numbers and make your own conclusions.
Sizing ’em up
Take some clock speed information and some other numbers about per-clock capacity from the latest crop of high-end graphics cards, and you get this neat table:
| | Peak pixel fill rate (Gpixels/s) | Peak bilinear filtering, int8/fp16 (Gtexels/s) | Peak rasterization rate (Gtris/s) | Peak shader arithmetic (TFLOPS) | Memory bandwidth (GB/s) |
| --- | --- | --- | --- | --- | --- |
| Radeon R9 290X | 64 | 176/88 | 4.0 | 5.6 | 320 |
| Radeon R9 Fury | 64 | 224/112 | 4.0 | 7.2 | 512 |
| Radeon R9 Fury X | 67 | 269/134 | 4.2 | 8.6 | 512 |
| GeForce GTX 780 Ti | 37 | 223/223 | 4.6 | 5.3 | 336 |
| Gigabyte GTX 980 | 85 | 170/170 | 5.3 | 5.4 | 224 |
| MSI GeForce GTX 980 Ti | 108 | 216/216 | 7.4 | 6.9 | 336 |
| GeForce Titan X | 103 | 206/206 | 6.5 | 6.6 | 336 |
| GeForce GTX 1080 | 111 | 277/277 | 6.9 | 8.9 | 320 |
Those are theoretical peak capabilities for each of the measures above. We won’t be testing every card in the table, but we’re leaving some older cards in to show how far we’ve come since Kepler. As you can see, the GTX 1080 provides a nice increase in pretty much every measure over GM204 and the GTX 980, and it’s even better in some regards than the GM200 GPU on board the Titan X and GTX 980 Ti. Let’s see how our calculations hold up with some tests from the Beyond3D suite.
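If you want to check our math, the peaks fall out of unit counts and boost clocks. Using the GTX 1080’s figures from this review (and counting a fused multiply-add as two FLOPs):

```python
# Theoretical peaks for the GTX 1080 from its unit counts and boost clock.
boost_ghz, rops, texels, shaders = 1.733, 64, 160, 2560

pixel_fill = rops * boost_ghz                 # ~111 Gpixels/s
texel_rate = texels * boost_ghz               # ~277 Gtexels/s
tflops     = shaders * 2 * boost_ghz / 1000   # FMA counts as 2 FLOPs
print(round(pixel_fill), round(texel_rate), round(tflops, 1))
```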
The GTX 1080 has the same number of ROPs as the GTX 980, but its substantially higher clocks and higher SM count allow it to deliver a substantial increase in pixel fill rate over that card, pushing past even the GM200-powered GeForce GTX 980 Ti. Good grief.
This bandwidth test measures GPU throughput using two different textures: an all-black surface that’s easily compressed and a random-colored texture that’s essentially incompressible. Throw an incompressible texture at the GTX 1080, and it produces a nice boost over the GTX 980. The GTX 980 Ti still comes pretty close, though, and the Fiji cards pull ahead. Once the card can take advantage of its compression mojo, however, the amount of throughput gets a little ridiculous. It appears the new delta-color-compression techniques Nvidia implemented in Pascal are definitely doing their thing.
All of the graphics cards tested come close to hitting their peak texture-filtering rate in this test. The GTX 1080 edges out the prodigious power of the R9 Fury X here, and it holds that lead both with simple and more complicated formats. It also speeds way past the GTX 980 and GTX 980 Ti, for the most part. In fact, we can already say that the GTX 980 Ti is the GTX 1080’s most natural competitor in these tests—the GTX 980 just can’t keep up.
As we’ve seen in past reviews, our GeForce cards actually slightly exceed their theoretical peaks in this polygon throughput test—substantially so, in the case of the GeForce GTX 1080. We’ve guessed that this test is especially amenable to GeForces’ GPU Boost feature in the past, so it’s possible the cards are just running really fast. Regardless, the GTX 1080 turns in some impressive numbers.
The situation is more normal in our ALU throughput tests, where all of the cards more or less hit their peak theoretical numbers.
All told, the GeForce GTX 1080 is an exceptionally potent graphics card by every theoretical measure we can throw at it. Let’s see how that performance carries over to some real games.
Grand Theft Auto V
Grand Theft Auto V has a huge pile of image quality settings, so we apologize in advance for the wall of screenshots. We’ve re-used a set of settings for this game that we’ve established in previous reviews, which should allow for easy comparison to our past tests. GTA V isn’t the most demanding game on the block, so even at 4K you can expect to get decent frame times out of higher-end graphics cards.
These “time spent beyond X” graphs are meant to show “badness,” those instances where animation may be less than fluid. The 50-ms threshold is the most notable one, since it corresponds to a 20-FPS average. We figure if you’re not rendering any faster than 20 FPS, even for a moment, then the user is likely to perceive a slowdown. 33 ms correlates to 30 FPS or a 30Hz refresh rate. Go beyond that with vsync on, and you’re into the bad voodoo of quantization slowdowns. And 16.7 ms correlates to 60 FPS, that golden mark that we’d like to achieve (or surpass) for each and every frame.
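In code, the metric takes only a few lines: for each frame over the threshold, we accumulate just the excess time. This is a minimal sketch of the idea, not our actual analysis scripts:

```python
# "Time spent beyond X": sum only the portion of each frame time that
# exceeds the threshold, in milliseconds.
def time_beyond(frame_times_ms, threshold_ms):
    return sum(t - threshold_ms for t in frame_times_ms if t > threshold_ms)

frames = [16.0, 18.0, 40.0, 16.0]
print(round(time_beyond(frames, 33.3), 1))  # 6.7: the one 40-ms frame
print(round(time_beyond(frames, 16.7), 1))  # 24.6: 1.3 + 23.3
```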
The one little hitch in the GTX 1080’s frame time graph above causes it to spend 4 ms beyond the 33.3-ms mark—barely worthy of note. The Fury X delivers similar performance. What’s really spectacular about the GTX 1080 is that it spends just 76 ms beyond that golden 16.7-ms mark, meaning we can be assured of a near-constant 60 FPS. With those smooth frame times, most of GTA V‘s gameplay experience on the GTX 1080 is fantastic at 4K. Even with a variable-refresh-rate display, the difference in fluidity between the GTX 1080 and the GTX 980 Ti can be felt in normal gameplay.
All of our contenders commendably spend no time past the 50-ms mark, and only the Radeon R9 Fury spends a significant number of milliseconds past the 33.3-ms hurdle. Move down to the 16.7-ms barrier, though, and nothing comes close to the GTX 1080’s smoothness. Even the GTX 980 Ti struggles a bit here.
Crysis 3
Crysis 3 is an old standby in our benchmarks. Even though the game was released in 2013, it still puts the hurt on high-end graphics cards. Unfortunately, our AMD Radeon R9 Fury and my 4K display have a disagreement of some sort, so the red team is only represented by the Fury X on this set of benches.
Would you look at that? The GTX 1080 is the only card that doesn’t show the significant frame-time “fuzziness” typical of inconsistent frame delivery throughout our Crysis 3 run. The average FPS metric is right where we’d like to see it for smooth gameplay, too. Our 99th percentile numbers suggest that there’s more to that 60-FPS average than meets the eye, though.
Even running this game at 4K, it’s hard to talk about “badness” with the 1080. We didn’t manage to catch a single frame taking longer than 50ms, or even one that took 33.3 ms. Once we hit the 16.7 ms mark, however, we can at least prove that the charts are still working. Here, we can see that the GTX 1080 spends a fair bit of time past 16.7ms working on frames: about 2.7 seconds. For comparison, though, the GeForce GTX 980 Ti spends almost 10 seconds past the 16.7 ms mark, and the R9 Fury X has to work even harder.
This major difference in our “badness” metric makes the GTX 1080 the first card we can really call smooth in Crysis 3 at 4K with high settings. The GTX 980 Ti we used is perfectly playable at these settings, but the subjective difference in smoothness between the two cards is definitely noticeable.
Rise of the Tomb Raider
Rise of the Tomb Raider is the first brand-new game in our benchmarking suite. To test it, I romped through one of the first locations in the game’s Geothermal Valley level, since it offers a diverse backdrop of snow, city, and forest environments. RotTR is a pretty demanding game, so I took this opportunity to dial the resolution back to 2560×1440. We also turned off the game’s PureHair feature to avoid predisposing the benchmark toward one card or another, since GPU-accelerated hair rendering (TressFX in 2013’s Tomb Raider, Nvidia’s HairWorks in other titles) has created significant performance deltas in past tests when we’ve had it on.
Our frame-time chart and 99th percentile chart make two things pretty clear about this test. For one, the GeForce GTX 1080 performs admirably. For two, this game seems to play quite well with Nvidia’s cards. The delta between the 980 Ti and the 1080 in 99th percentile frame time isn’t huge, but it’s still worthy of note. For a clearer comparison, we need to look at the specific frame-time thresholds.
Well, that’s more like it. The 99th-percentile frame time alone hides an important point of comparison between the two fastest cards in this metric. While all three of the GeForces we tested keep their frame times below 33.3 ms, the GTX 980 Ti spends much more time past 16.7 ms than the GTX 1080. This test indicates that you don’t have to have a 4K display to benefit from the added grunt of the GTX 1080, presuming you want to turn those quality sliders way up.
Fallout 4
Even at 4K with its settings turned up, Fallout 4 just isn’t the most demanding game out there. The GTX 1080 makes pretty short work of it. Once again, we see 99th-percentile frame times just barely above the magic 16.7-ms mark. (Fallout 4 has a 60-FPS cap enabled by default, but we disabled it for these tests.)
No matter which way you squint at it, the GTX 1080 does a fantastic job of running Fallout 4 at respectable frame rates. The GTX 980 Ti also acquits itself well, although it spends about three times as much time past the 16.7-ms mark. In absolute terms, though, you’d be hard-pressed to notice frame-delivery roughness from either card. The Fury X struggles a bit here, and the GTX 980 brings up the rear. Once again, the GTX 1080 is the card to beat for smooth gaming at 4K in our tests.
The Witcher 3
The Witcher 3 is another benchmark where I re-used the settings we’ve settled on in past reviews. We also chose to test this title at 2560×1440 rather than 4K, in part because we wanted to maintain consistency with the numbers we produced in our Fury X review, but also because the game is demanding enough that playing it at 4K with high settings wasn’t a great experience even on newer high-end cards.
The GTX 1080 is rapidly depleting my reserve of ways to say “well, that went well.” Its average-FPS numbers for this test are impressive, and its 99th-percentile frame time hovers just above the magic 16.7-ms mark. Remember that as 99th-percentile times get lower, it gets harder to push down the absolute values. While the difference between the GTX 980 Ti’s 99th-percentile result and the GTX 1080’s is only 1.5 ms, that difference can still have a noticeable effect on smoothness.
Sorry, but the numbers above already tell the story here. The GTX 1080 is smooth as butter. In fact, the most interesting thing to note about these results is that the Fury X has significantly improved its showing since our initial review, both in its 99th-percentile frame times and in its “badness” performance. Even so, it can’t catch the GTX 980 Ti or the GTX 1080 for smoothness. The GTX 1080 spends just 71 ms past the 16.7-ms mark in The Witcher 3, and that makes for some wonderfully smooth gameplay on Nvidia’s latest.
Hitman
The 2016 version of Hitman closes out our test suite. We chose to bench this demanding title at 4K to really make our graphics cards sweat.
The GTX 1080 finally struggles a bit in this test, but so do the rest of our sample cards. The GTX 1080 opens up about the same lead over the GTX 980 Ti we’ve seen in our other tests, but neither card turned in a particularly impressive 99th-percentile frame time, all things considered. The Radeon R9 Fury X seems to be making a good showing based on FPS alone, but its relatively spiky frame-time plot tells a different story.
Going by our “badness” metrics, the GTX 1080 doesn’t quite turn in the glass-smooth performance it has in most of our other games. It’s still smoother than the other cards we tested, though, and by a significant margin. The 980 Ti and the Fury X both produce admirable average FPS figures, but they spend quite a bit of time churning away on frames that can’t be completed in under 16.7 ms. That data is consistent with the 99th-percentile frame times we gathered.
While these numbers might be discouraging for readers hoping for playable frame rates at 4K in Hitman, lowering the resolution and graphics settings to a saner level produces predictably better performance, going by our informal tests. That said, it is clear that we can’t just assume that the GTX 1080 will be able to deal with whatever we throw at it without breaking a sweat.
Power consumption
Let’s take a look at how much juice the GTX 1080 needs to do its thing. Our “under load” tests aren’t conducted in an absolute peak scenario. Instead, we have the cards running a real game, Crysis 3, in order to show us power draw with a more typical workload.
Here we can see another advantage of Pascal’s 16-nm FinFET process. Not only does the GTX 1080 roundly outperform the rest of the cards we tested, it does so while using far less power. In our Crysis 3 power-test run, peak power draw on the GTX 1080 was quite a bit lower than that of any of its competitors. Nvidia isn’t giving up the efficiency crown with this new generation of chips, that’s for certain.
Noise levels and GPU temperatures
Thanks to the move to a slightly different test rig, the noise floor on our test system is rather high: even with a passive graphics card, the closed-loop liquid cooler’s pump and fan left us with a 40-dBA noise floor. Still, some cards showed a significant increase from that floor when under load. Our test rig also happens to be down in sunny Alabama, and between the high temperatures and a creaky old house, the ambient temperature in our testing environment was about 80° F (about 27° C), so the zero point for our temperature numbers is a bit higher than in previous reviews. Those caveats aside, let’s see how loud and hot our cards get under load.
The Founders Edition cooler on the GTX 1080 sadly doesn’t do a great job of keeping the GPU underneath cool, or even all that quiet. The card’s load noise levels are only exceeded by the triple-fan cooler on the Gigabyte Windforce GTX 980 we have on hand, and its load temperatures are the worst of the pack by a wide margin. The sound from the single blower-style fan also has an unpleasant grinding quality, something our absolute noise measurements can’t convey. The Fury X produces a similarly unusual and annoying sound: a high-pitched whine that we’ve picked up on in past reviews.
Now’s as good a time as any to talk about the GTX 1080’s overclocking potential. While the silicon lottery certainly plays a role, it’s equally important to have a good cooler strapped onto the GPU you’re trying to tweak. While the GP104 chip itself might have plenty of overclocking potential on tap, we’d be wary of trying to push it too far with the Founders Edition cooler given our stock-clocked results. Plenty of Nvidia’s board partners are now selling GeForce GTX 1080s, and the custom coolers on those boards might unlock the thermal headroom one would want to really push the clocks skyward. For now, we’re reserving judgment on the GTX 1080’s overclocking prowess.
Conclusions
Before we dive into our conclusions, let’s take a look at our famous value scatters to see what kind of performance the GeForce GTX 1080 gets you—at least, in its Founders Edition form. Since prices have fallen on the GeForce GTX 980, GeForce GTX 980 Ti, and the Radeon R9 Fury X since their launches, we’ve surveyed all of the in-stock cards of each model available from Newegg right now and averaged their prices to present what we feel is a fair picture of the high-end graphics market today. (Since the Radeon R9 Fury couldn’t participate in all our tests, we’re leaving it off these charts.) The best values in these charts congregate toward the top left corner, where performance is high and prices are low.
First, let’s look at the potential performance that each card has on tap in the form of average FPS per dollar. Surprising nobody who’s read the past few pages, the GTX 1080 rockets to the top right corner of the chart. If you’re already in possession of a hot GTX 980 Ti, the GTX 1080 Founders Edition doesn’t offer a giant step up in performance, but it’s still a significant one. Our results suggest you really want a GTX 1080 to game smoothly at 4K with the titles we tested, and this card also has an advantage in demanding games running at 2560×1440.
Next, let’s crunch some numbers from our advanced frame-time metrics to determine how smooth a gaming experience the GTX 1080 delivers for its price tag. To make our “higher is better” arrangement work with frame times, we’ve converted the geometric mean of each card’s 99th-percentile frame times in our tests into an FPS value.
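In rough terms, that conversion looks like the following sketch; the per-game frame times and price in it are hypothetical placeholders, not our measured results:

```python
# Fold per-game 99th-percentile frame times into one "higher is better"
# number: take the geometric mean across games, then convert ms-per-frame
# into FPS. The per-game values and price below are hypothetical.
import math

def geomean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

per_game_ms = [17.1, 18.4, 16.9, 21.0, 17.6]  # hypothetical 99th-percentile times

mean_ms = geomean(per_game_ms)
fps_equivalent = 1000.0 / mean_ms  # milliseconds per frame -> frames per second
print(f"99th-percentile FPS: {fps_equivalent:.1f}")

# Dividing by a card's street price yields the FPS-per-dollar figure
# plotted on the value scatter.
price = 699.0  # hypothetical street price
print(f"99th-percentile FPS per dollar: {fps_equivalent / price:.3f}")
```

We use the geometric mean rather than the arithmetic mean so that no single unusually demanding game dominates the combined score.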
No surprises here, either. The GTX 1080’s 99th-percentile-FPS-per-dollar figure is so high that we had to add a rung to our chart to make it visible. It delivers unparalleled smoothness in our tests. Again, the GTX 980 Ti isn’t too far behind, but you are getting significantly smoother frame delivery for your money when you pay for a GTX 1080. Meanwhile, the Radeon R9 Fury X doesn’t deliver gameplay that’s any smoother than the GeForce GTX 980 we tested. Both of those cards are considerably outclassed by Nvidia’s latest.
While the GTX 1080 Founders Edition does deliver world-beating performance for a single-GPU card, it leaves a bit to be desired in the noise, vibration, and harshness department. Its blower-style cooler isn’t particularly quiet or pleasant-sounding, and it doesn’t keep the GP104 chip all that cool under load, either. Those shortcomings are hard to swallow at this price. $700 is a lot of money for a graphics card, and considering the premium that Nvidia charges for these cards over ones with custom third-party coolers and factory tuning jobs, we don’t think they represent a great value. Sure, the new reference heatsink design looks nice, but we don’t think good looks are enough reason for buyers to fork over the extra cash.
Happily, Nvidia’s board partners are starting to deliver a diverse range of custom-cooled GTX 1080s themselves, and our experience with those cards so far has been positive. GTX 1080 custom jobs can cost significantly less than the Founders Edition, and they tend to come with beefier heatsinks and factory clock boosts. Unless you’re really into the Founders Edition look, we think most buyers will be happier with one of these hot-rodded GTX 1080s. Those cards technically carry a $599.99 suggested price, but the GTX 1080’s popularity and low stock have conspired to bring the retail prices for most of those hot rods closer to the Founders Edition’s $700 sticker. If you’ve gotta have a GTX 1080 now, though, we think you still ought to opt for a custom card. They’re just better values.
No matter what flavor of GTX 1080 buyers end up with, Nvidia deserves high praise for pushing the envelope of graphics performance to new heights with the GTX 1080, the GP104 GPU, and the move to next-generation process tech. Even better, the company is offering that performance in a relatively affordable package for a high-end graphics card. If you’ve got a big wad of cash burning a hole in your pocket from the long pause at the 28-nm process node, you’ll be richly rewarded by the smoothness and performance that the GTX 1080 offers if you choose to spend it now.