Several software updates
Nvidia recently rolled out a new release of its GeForce graphics drivers, R320, ahead of the GTX 780's introduction. Those drivers include the usual sort of performance increases one might expect, including optimizations for Tomb Raider and Metro: Last Light. Along with those updates, the R320 drivers have some deep voodoo intended to reduce the sort of performance problems we now track regularly: frame time variations. These drivers should help even out frame delivery on GeForce cards, although Nvidia won't say exactly why or how they achieve that goal. Hmmm.
An even bigger change is happening today. After being downloaded over 2.5 million times during testing, Nvidia's GeForce Experience software is leaving the beta stage and going official. Immediately, it becomes Nvidia's recommended software option for GeForce owners. For the uninitiated, GeForce Experience does a couple of things for gamers automatically. The program helps manage the downloading and installation of graphics driver updates, replacing old-school manual driver downloads with an automated tool.
Also, GeForce Experience will scan your system for installed games, read in their current image quality settings, and recommend optimal settings for your GPU. You can see an example above from our test system where it's recommending a switch to TXAA anti-aliasing. If you click the "Optimize" button, the GFE software will write its optimized settings to disk, so the game will start up with those options the next time it runs. Nvidia has taken the time to profile a host of games on its GPUs in order to make this sort of automation possible. For the average gamer who's probably befuddled by the choice between SSAO, HBAO, and HDAO, this sort of thing could be incredibly helpful. Even for folks who consider themselves experts, I'd consider these profiles a worthwhile resource. Folks are, of course, free to deviate from Nvidia's recommended settings if they so choose.
A couple of other nifty features are on the horizon for GFE. First, when Nvidia's Shield handheld gaming system arrives next month, GFE will act as the server software for remote gaming sessions. The idea here is that, in a sort of "local cloud" config, the Shield handheld can control a game running on your home PC. The visuals will be streamed to Shield over the network in real time, after being compressed via the GPU's H.264 video encoding hardware. This I want.
The other addition will probably prove to be even more popular. Nvidia's calling it ShadowPlay, and the concept is simple. Folks who record their gaming sessions for others to watch will know that recording via Fraps or streaming a session via other tools can involve lots of performance overhead. ShadowPlay will use the H.264 video encoding block built into any Kepler GPU to enable in-game recording with minimal performance impact. In fact, the overhead is low enough that Nvidia touts ShadowPlay's potential as an "always on" recording feature. Imagine being able to allocate disk space so that the last 20 minutes of your latest gaming session are always available. That's the idea here.
I'd like to give you more details about exactly how ShadowPlay will work, but I haven't been able to try it yet. Nvidia tells us ShadowPlay is "coming this summer," so it's still in development right now. I think there's some possibility that game streaming services might be supported eventually, in addition to local recording. If that's something you'd like to have, you might want to post something in the comments about it.
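Since ShadowPlay isn't out yet, the best I can offer is a rough sketch of the "always on" idea: a rolling buffer that keeps only the most recent stretch of encoded video and throws away the rest. Everything here is hypothetical, including the `RollingRecorder` name, the two-second chunk size, and the one-byte stand-in payloads. It says nothing about how Nvidia's software is actually built.

```python
from collections import deque


class RollingRecorder:
    """Hypothetical rolling buffer: keeps the last `window_s` seconds of chunks."""

    def __init__(self, window_s):
        self.window_s = window_s
        self.chunks = deque()  # entries are (start_s, duration_s, payload)

    def push(self, start_s, duration_s, payload):
        """Append a newly encoded chunk and evict anything older than the window."""
        self.chunks.append((start_s, duration_s, payload))
        window_start = start_s + duration_s - self.window_s
        # Evict chunks that ended at or before the start of the window.
        while self.chunks and self.chunks[0][0] + self.chunks[0][1] <= window_start:
            self.chunks.popleft()

    def buffered_seconds(self):
        return sum(duration for _, duration, _ in self.chunks)

    def save(self):
        """Concatenate whatever is still buffered, oldest chunk first."""
        return b"".join(payload for _, _, payload in self.chunks)


# Simulate 50 minutes of play in two-second chunks with a 20-minute window.
recorder = RollingRecorder(window_s=20 * 60)
for i in range(1500):
    recorder.push(start_s=i * 2.0, duration_s=2.0, payload=b"x")

print(recorder.buffered_seconds() / 60, "minutes buffered")
```

The point of the design is that disk or memory use stays bounded no matter how long the session runs, which is what would make leaving the recorder on all the time practical.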
That's it for the GeForce Experience, but Nvidia has made one other software change worth mentioning. Both the GTX Titan and the GTX 780 have version 2.0 of Nvidia's Boost dynamic clocking routine. Boost allows Nvidia's Kepler-based graphics cards to operate at higher clock speeds by monitoring a range of inputs including GPU utilization, power draw, and temperature. Version 2.0 of Boost debuted with the GTX Titan, and it introduced a new algorithm that keys on GPU temperatures, rather than power draw, when making its decisions about what clock speeds to choose. This algorithm opened up more frequency headroom for Titan.
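To make the "keys on temperature" notion a little more concrete, here is a toy controller that nudges the clock up one step while the GPU is below a temperature target and back down when it runs above. This is only an illustration. Nvidia hasn't published the Boost 2.0 algorithm, and the 80°C target, 13MHz step, and clock floor and ceiling below are all placeholder assumptions, not figures from Nvidia.

```python
def next_clock_mhz(current_mhz, temp_c, target_c=80.0,
                   floor_mhz=863, ceiling_mhz=1006, step_mhz=13):
    """Toy temperature-keyed clock selection (all defaults are assumptions)."""
    if temp_c < target_c:
        candidate = current_mhz + step_mhz   # thermal headroom: try a higher clock
    elif temp_c > target_c:
        candidate = current_mhz - step_mhz   # too hot: back off one step
    else:
        candidate = current_mhz              # right at the target: hold
    # Clamp to the hypothetical floor and ceiling.
    return max(floor_mhz, min(ceiling_mhz, candidate))


# Walk the clock through a made-up series of temperature readings.
clock = 863
for temp in [70.0, 72.0, 75.0, 79.0, 82.0, 80.0]:
    clock = next_clock_mhz(clock, temp)
print(clock)
```

A real implementation weighs many more inputs at once, which is exactly why its behavior can be hard to predict from the outside.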
Nvidia initially exposed a host of tweaking options for end users, accessible via tools like EVGA's Precision, so folks could overclock their Titans with Boost 2.0 active. Trouble is, the Boost 2.0 algorithm is very complex, with lots of inputs determining whether it's safe for GPU clock speeds to rise. Users could have the off-putting experience of cranking up the GPU clock speed slider, expecting to get MOAR POWER, only to see little or nothing happening because of some other limitation.
To rectify this situation, Nvidia has now exposed some additional variables so users can understand which factor is currently constraining GPU clock speeds. The screenshot above from EVGA Precision shows the relevant limits on our GTX 780 card while running a simple graphics workload. If one of these numbers turns from a 0 to 1, it has become the limiting factor for GPU frequencies. As you can see, in our example, the GPU voltage limit is currently keeping clock speeds in check. If we want higher clocks, we'll need to overvolt our GPU a bit. This addition should take some of the mystery out of attempting to overclock a Boost 2.0-enabled GPU like the GTX 780 or Titan.
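Reading those indicators boils down to finding whichever flag has flipped to 1. A minimal sketch, assuming the values have been copied into a dictionary by hand: the key names below are made up for illustration and are not the labels or any API that EVGA Precision exposes.

```python
from typing import Dict, List


def active_limits(flags: Dict[str, int]) -> List[str]:
    """Return the names of every limit currently reporting 1 (i.e., binding)."""
    return [name for name, value in flags.items() if value == 1]


# Hypothetical snapshot resembling our GTX 780 example, where voltage is the cap.
sample = {
    "power_limit": 0,
    "temp_limit": 0,
    "voltage_limit": 1,
}

limits = active_limits(sample)
if limits:
    print("Clock speeds constrained by:", ", ".join(limits))
else:
    print("No limit is currently binding")
```

If the list comes back empty, raising the clock offset should have some effect. If it doesn't, that tells you which slider to reach for next.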
The results you'll see on the following pages are not like those from old-school video card reviews, as our regular readers will know. Instead, we analyze the time needed to render each and every frame of animation. For an intro to this approach, you should probably start with this article. You'll also notice that we're capturing frame times with two different tools, Fraps and FCAT. These tools sample at different points in the frame production pipeline: Fraps near the beginning and FCAT at the endpoint, the display. The results from both are potentially important, especially when they disagree. For an explanation of these tools, see here.
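For readers new to frame-time analysis, here is a small sketch of the general idea: turn a list of frame timestamps into per-frame times, then summarize them in ways that expose slowdowns an FPS average would hide. The timestamps are made-up sample data, and the two summary statistics used here (a nearest-rank 99th percentile and time spent beyond a threshold) are this sketch's own choices rather than something spelled out above.

```python
import math


def frame_times_ms(timestamps_ms):
    """Per-frame render times from cumulative timestamps (milliseconds)."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def time_beyond_ms(frame_times, threshold_ms):
    """Total time spent past the threshold, summed over the slow frames."""
    return sum(t - threshold_ms for t in frame_times if t > threshold_ms)


# Made-up capture: mostly smooth frames with a single 70 ms hitch.
timestamps = [0, 16, 33, 50, 120, 136]
times = frame_times_ms(timestamps)

average_fps = 1000 * len(times) / (timestamps[-1] - timestamps[0])
print("frame times (ms):", times)
print("average FPS:", round(average_fps, 1))
print("99th percentile (ms):", percentile_nearest_rank(times, 99))
print("time beyond 50 ms:", time_beyond_ms(times, 50))
```

In this sample the FPS average looks tolerable, yet one frame took 70 ms to arrive. That kind of hitch is what per-frame analysis is meant to catch.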
Fortunately, our task for today is relatively straightforward, as these things go. After diving deep into the issue of multi-GPU microstuttering in my Radeon HD 7990 review, I've elected to concentrate today on single-GPU solutions. I think that's the right call for now, since dual-GPU Radeon configs aren't likely to pose much of a challenge to the GTX 780—not unless and until AMD releases a public driver with the frame-pacing capability we explored in the 7990 review.
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Our test systems were configured like so:
|Chipset||Intel X79 Express|
|Memory size||16GB (4 DIMMs)|
|Memory type||DDR3 SDRAM at 1600MHz|
|Memory timings||9-9-9-24 1T|
|Chipset drivers||INF update, Rapid Storage Technology Enterprise 22.214.171.1249|
|Audio||Realtek 126.96.36.19962 drivers|
|Hard drive||OCZ Deneva 2 240GB SATA|
|Power supply||Corsair AX850|
|OS||Windows 7 Service Pack 1|
|Card||Driver revision||Base clock (MHz)||Boost clock (MHz)||Memory clock (MHz)||Memory size (MB)|
|GeForce GTX 680||GeForce 320.18 beta||1006||1059||1502||2048|
|GeForce GTX 780||GeForce 320.18 beta||863||902||1502||3072|
|GeForce GTX Titan||GeForce 320.14 beta||837||876||1502||6144|
|Radeon HD 7970 GHz||Catalyst 13.5 beta 2||1000||1050||1500||3072|
Thanks to Intel, Corsair, Gigabyte, and OCZ for helping to outfit our test rigs with some of the finest hardware available. AMD, Nvidia, and the makers of the various products supplied the graphics cards for testing, as well.
Also, our FCAT video capture and analysis rig has some pretty demanding storage requirements. For it, Corsair has provided four 256GB Neutron SSDs, which we've assembled into a RAID 0 array for our primary capture storage device. When that array fills up, we copy the captured videos to our RAID 1 array, made up of a pair of 4TB Black hard drives provided by WD.
Unless otherwise specified, image quality settings for the graphics cards were left at the control panel defaults. Vertical refresh sync (vsync) was disabled for all tests.
In addition to the games, we used the following test applications:
The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.