Crysis 3 easily stresses these video cards at its highest image quality settings with 4X MSAA. You can see a video of our test run and the exact settings we used below.
Here's where things become a little bit (more?) complicated. You'll see in the graphs below that I've included two results from two different tools, Fraps and FCAT. As I noted last time out, both tools are essential to understanding animation smoothness, particularly in multi-GPU configs where the possibility of microstuttering looms in the backdrop.
Fraps does its sampling early in the frame production pipeline, so its timing info should correspond pretty closely to what the game engine sees—and the game engine determines the content of the frames by advancing its physical simulation of the game world. An interruption in Fraps frame times, if it's large enough, will yield a perceptible interruption in animation, even if buffering and frame metering later in the pipeline smooths out the delivery of frames to the display. We saw an example of this phenomenon in our last article, and we've seen others in our own testing.
Meanwhile, FCAT tells you exactly when a frame arrives at the display. An interruption there, if it's large enough to be perceptible, will obviously cause a problem, as well. Because it monitors the very end of the frame production pipeline, FCAT may show us problems that software-based tools don't see.
One complication we ran into is that, in most newer games, Fraps and the FCAT overlay will not play well together. You can't use them both at the same time. That means we can't provide you with neat, perfectly aligned data from the same five test runs per card, captured with both Fraps and FCAT. What we've done instead is conduct three test runs with FCAT and three with Fraps. We've then mashed the two together. The plots you'll see come from the second of the three test runs for each card and test type. Since the Fraps and FCAT plots come from different test sessions, conducted manually, the data in them won't align perfectly. Both tools should be producing results that are correct for what they measure, but they are measuring different test runs and at different points in the pipeline.
Since we're looking at the question of microstuttering, I've included zoomed-in snippets of our frame time plots, so that we can look at those jitter patterns up close. Note that you can switch between all three plots for each GPU by pressing one of the buttons below.
If you click through the different GPUs, several trends become apparent. Even though they're from different test runs, the Fraps and FCAT data for the single-GPU products tend to be quite similar. As we've seen before, there's a little more variability in the Fraps frame times, but buffering smooths out the timing before the frames reach the display, where FCAT takes its measurements.
For the multi-GPU cards like the GTX 690 and 7990, though, the Fraps and FCAT results diverge. That familiar microstuttering jitter is apparent in the Fraps results from the 7990, and the swing from frame to frame grows even larger by the time those frames reach the display. This is the sort of situation we'd hoped to avoid. The "long" frame times on the 7990 reach to 70 ms and beyond—the equivalent of 14 FPS—while the shorter frame times in FCAT are well under 10 ms. When you're waiting nearly 70 milliseconds for every other frame to come along, you're not talking about creamy smooth animation.
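To put rough numbers on that, here's a minimal Python sketch. The helper names and the alternating 70-ms/8-ms trace are made up for illustration (they are not our measured data), but they show how a jittery sequence and a perfectly steady one can share the same average frame time while looking nothing alike from frame to frame.

```python
def fps_equivalent(frame_time_ms: float) -> float:
    """Frame rate a steady stream of frames at this interval would produce."""
    return 1000.0 / frame_time_ms


def frame_to_frame_swings(frame_times_ms):
    """Absolute change in frame time between each consecutive pair of frames."""
    return [abs(b - a) for a, b in zip(frame_times_ms, frame_times_ms[1:])]


# Hypothetical traces (assumed values, not measurements).
jittery = [70.0, 8.0, 70.0, 8.0, 70.0, 8.0]   # long/short microstutter pattern
steady = [39.0] * 6                            # same average, no jitter

avg_jittery = sum(jittery) / len(jittery)      # 39.0 ms
avg_steady = sum(steady) / len(steady)         # 39.0 ms

print(round(fps_equivalent(70.0), 1))          # a 70-ms frame is about 14.3 FPS
print(frame_to_frame_swings(jittery))          # 62-ms swing on every frame
print(frame_to_frame_swings(steady))           # no swing at all
```

An FPS average treats those two traces as identical, which is exactly why we look at individual frame times instead.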
The GTX 690 isn't immune to microstuttering issues, either. A pronounced jitter shows up in its Fraps results, even though it's smoothed out by SLI frame metering before the frames reach the display. The impact of frame metering here is pretty remarkable, but it's not a cure-all. Frames are still being generated according to the game engine's timing. As I understand it, the vast majority of game engines tend to sample the time from a Windows timer and use that to decide how to advance their simulations. As a result, the content of frames generated unevenly would advance unevenly, even if the frames hit the display at more consistent intervals. The effect would likely be subtle, since the stakes here are tens of milliseconds, but it's still less than ideal.
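Here's a small sketch of why that matters. It assumes a deliberately simple engine that moves an object at constant speed based on the time it samples at the start of each frame. The function name, the speed, and the sample times are all hypothetical, and real engines are more sophisticated than this.

```python
def per_frame_motion(sample_times_ms, speed_px_per_ms=1.0):
    """Distance an object moves between consecutive frames when the engine
    positions it using the wall-clock time sampled at the start of each frame."""
    positions = [t * speed_px_per_ms for t in sample_times_ms]
    return [b - a for a, b in zip(positions, positions[1:])]


# Hypothetical engine-side sample times (ms) with a long/short jitter pattern.
sample_times = [0, 30, 40, 70, 80, 110]

# Frame metering could show these six frames at even 22-ms intervals,
# but the content of each frame was fixed when the engine sampled the clock.
steps = per_frame_motion(sample_times)
print(steps)  # the object jumps 30 px, then 10 px, then 30 px, and so on
```

Even with perfectly even delivery to the display, the object in this toy example advances in alternating big and small steps, which is the unevenness frame metering can't undo.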
The results for both of the multi-GPU cards illustrate an undesirable trait of the microstuttering problem. As frame times grow, so does the opportunity for frame-to-frame jitter. The 7990 seems to be taking especially full advantage of that opportunity. This is one reason why we wanted to test SLI and CrossFire in truly tough conditions. Looks like, when you need additional performance the most, multi-GPU configs may fail most spectacularly to deliver.
The 7990 wins the FPS beauty contest, confirming its copious raw GPU power, if nothing else. The 99th percentile frame time focuses instead on frame latencies and, because it denotes the threshold below which all but 1% of the frames were produced, offers a tougher assessment of animation smoothness. All of the cards are over the 50-ms mark here, so they're all producing that last 1% of frames at under 20 FPS. (Told you this was a tough scenario.)
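For the curious, a 99th percentile frame time can be computed along these lines. This is a sketch using the nearest-rank method, which is one common way to define a percentile and an assumption on my part for this example. The function name and the sample trace are hypothetical.

```python
import math


def percentile_frame_time(frame_times_ms, pct=99.0):
    """Nearest-rank percentile: the smallest frame time such that at least
    pct percent of the frames took that long or less."""
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    ordered = sorted(frame_times_ms)
    rank = math.ceil(pct * len(ordered) / 100.0)
    return ordered[max(rank, 1) - 1]


# Hypothetical trace: 98 quick frames plus two slow ones (not measured data).
trace = [20.0] * 98 + [55.0, 90.0]

p99 = percentile_frame_time(trace)
print(p99, round(1000.0 / p99, 1))  # threshold in ms and its FPS equivalent
```

In this made-up trace, all but 1% of the frames come in at 55 ms or less, so the 99th percentile lands at 55 ms even though the vast majority of frames took only 20 ms.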
Also, the 99th percentile numbers for Fraps and FCAT tend to differ, which makes sense in light of the variations in the two sets of plots above. What do we make of them? My sense is that a good solution should avoid slowdowns at both points in the pipeline, at frame generation and delivery. If so, then true performance will be determined by the slower of the two sets of results for each GPU.
Picking out the correct points of focus in the graph above made my eyes cross, though. We can tweak the colors to highlight the lower-performance number for each card:
I think that's helpful. By this standard, the single-GPU GeForce GTX Titan is the best performer. The GTX 690 and Radeon HD 7990 aren't far behind, but they're not nearly as far ahead of the GTX 680 and 7970 as the FPS numbers suggest.
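The "slower of the two" rule is simple enough to write down. In this sketch, the function name and the card labels and numbers in the example are placeholders I've invented, not our results.

```python
def limiting_result(fraps_ms: float, fcat_ms: float):
    """Return (tool, value) for the slower, higher-latency of two results."""
    if fraps_ms >= fcat_ms:
        return ("Fraps", fraps_ms)
    return ("FCAT", fcat_ms)


# Placeholder 99th percentile frame times in ms (hypothetical, not measured).
results = {
    "Card A": {"fraps": 58.0, "fcat": 52.0},
    "Card B": {"fraps": 55.0, "fcat": 66.0},
}

limiting = {
    card: limiting_result(r["fraps"], r["fcat"]) for card, r in results.items()
}

# Rank cards by their limiting number; lower is better.
ranking = sorted(limiting, key=lambda card: limiting[card][1])
print(limiting)
print(ranking)
```

The idea is that a card only gets credit for the worse of its generation-side and delivery-side numbers.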
Here's a look at the entire latency curve. You can see how the 7990 FCAT curve starts out strangely low, thanks to the presence of lots of unusually short frame times in that jitter sequence, and then ramps up aggressively. By the last few percentage points, the 7990's FCAT frame times catch up to the 7970's. Although the 7990 is generally faster than the 7970, it's not much better when dealing with the most difficult portions of the test run. The GTX 690's Fraps curve suffers a similar fate compared to the Titan's, but by any measure, the GTX 690 still performs quite a bit better than the GTX 680.
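To see why a jittery trace produces a curve that starts low and then ramps up, here's a sketch that builds a coarse latency curve from a list of frame times. The nearest-rank percentile method, the function name, and both traces are assumptions for illustration only.

```python
import math


def latency_curve(frame_times_ms, percentiles=(50, 75, 90, 95, 99)):
    """Return (percentile, frame time) pairs using the nearest-rank method."""
    ordered = sorted(frame_times_ms)
    n = len(ordered)
    curve = []
    for p in percentiles:
        rank = max(math.ceil(p * n / 100), 1)
        curve.append((p, ordered[rank - 1]))
    return curve


# Hypothetical traces with the same 39-ms average frame time (not measured).
jittery = [8.0, 70.0] * 50    # alternating short and long frames
steady = [39.0] * 100         # every frame takes the same time

print(latency_curve(jittery))  # low at the 50th percentile, then jumps to 70 ms
print(latency_curve(steady))   # flat at 39 ms throughout
```

All of those unusually short frames drag down the left side of the jittery curve, but they do nothing for the right side, where the long frames live.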
This last set of results gives us a look at "badness," at those worst-case scenarios when rendering is slowest. What we're doing is adding up any time spent working on frames that take longer than 50 ms to render—so a 70-ms frame would contribute 20 ms to the total, while a 53-ms frame would only add 3 ms. We've picked 50 ms as our primary threshold because it seems to be something close to a good perceptual marker. A steady stream of 50-ms frames would add up to a 20 FPS frame rate. Go slower than that, and animation smoothness will likely be compromised.
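The arithmetic behind that tally looks like this in a minimal Python sketch. The function name is my own invention, and the sample frame times reuse the 70-ms and 53-ms examples from above plus one quick frame.

```python
def time_beyond(frame_times_ms, threshold_ms=50.0):
    """Total time spent past the threshold, counting only the excess
    portion of each frame that takes longer than threshold_ms."""
    return sum(t - threshold_ms for t in frame_times_ms if t > threshold_ms)


# A 70-ms frame adds 20 ms, a 53-ms frame adds 3 ms, a 40-ms frame adds nothing.
sample = [70.0, 53.0, 40.0]
print(time_beyond(sample))        # 23.0
print(time_beyond(sample, 60.0))  # 10.0 with a hypothetical 60-ms threshold
```

Because only the excess counts, one very long frame weighs more heavily than several frames that barely cross the line.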
I've taken the liberty of coloring the slower of the two results for each card here, as well, to draw our focus. The outcomes are intriguing. The 7990 spends about 40% less time above the 50-ms threshold than its single-GPU sibling, the 7970. That's a respectable improvement, in spite of everything. The gains from a single GTX 680 to the 690 are even more dramatic, in part because the single 680 performs so poorly. The Titan again comes out on top.
You're probably wondering what all of these numbers really mean to animation smoothness. We've captured every frame of animation during our FCAT testing, and I've spent some time watching the videos from the different cards back to back, trying to decide what I think. I quickly learned that being precise in subjective comparisons like these is incredibly difficult. To give you a sense of things, I've included short snippets from several of the video cards below. These are FCAT recordings slowed down to half speed (30 FPS) and compressed for YouTube. They won't give you any real sense of image quality, but they should demonstrate the fluidity of the animation. We'll start with the Radeon HD 7970:
And the 7990:
And now the GTX 690:
Finally, the Titan:
Like I said, making clear distinctions can be difficult, both with these half-speed online videos and with the source files (or while playing). I do think we can conclude that the FPS results suggesting the multi-GPU solutions are twice as fast as the single-GPU equivalents appear to be misleading. Watching the videos, you'd never guess you were seeing a "17 FPS" solution versus a "30 FPS" one; the 7990 is an improvement over the 7970, but the difference is much subtler. The same is true for the GTX 690 versus the 680. I do think the Titan comes out looking the smoothest overall. In fact, I'm more confident than ever that our two primary metrics track well with human perception after this little exercise. The basic sorting that they have done—with the Titan in the lead, the multi-GPU offerings next, and their single-GPU counterparts last—fits with my impressions.
So, are the dual-GPU cards better than their single-GPU versions? Yes, in this scenario, I believe they are. How much better? Not nearly enough to justify paying over twice the price of a 7970 for a 7990.