I’ve been buried in my own work while preparing our Titan review, but a lot has happened in the past few weeks, as many in the industry have moved toward adopting some form of game performance testing based on frame rendering times rather than traditional FPS. Feels like we’ve crossed a threshold, really. There’s work to be done figuring out how to capture, analyze, and present the data, but folks seem to have embraced the basic approach of focusing on frame times rather than FPS averages. I’m happy to see it.
I’ve committed to writing about developments in frame-latency-based testing as they happen, and since so much has been going on, some of you have written to ask about various things.
Today, I’d like to address the work Ryan Shrout has been doing over at PC Perspective, which we’ve discussed briefly in the past. Ryan has been helping a very big industry player to test a toolset that can capture every frame coming out of a graphics card over a DVI connection and then analyze frame delivery times. The basic innovation here is a colored overlay that varies from one frame to the next, a sort of per-frame watermark.
The resulting video can be analyzed to see all sorts of things. Of course, one can extract basic frame times like those we get from Fraps, only measured at the very end of the rendering pipeline. These tools also let you see what portion of the screen is occupied by which frames when vsync is disabled. You could also detect when frames aren’t delivered in the order they were rendered. All in all, very useful stuff.
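To make that a bit more concrete, here’s a rough Python sketch of how one might turn the overlay colors into per-frame numbers. To be clear, this is not the actual toolset, which I haven’t described in any detail here. The function name, the input format (one overlay color per captured scanline, in scan-out order) and the choice to ignore vertical blanking are all my own assumptions for illustration.

```python
from itertools import groupby


def frame_runs(scanline_colors, lines_per_refresh, refresh_ms):
    """Collapse per-scanline overlay colors into (color, scanlines, ms) runs.

    scanline_colors: the overlay color seen on each captured scanline, in
    scan-out order, across one or more consecutive refreshes. This input
    format is hypothetical, not the format any real capture tool emits.
    """
    if lines_per_refresh <= 0 or refresh_ms <= 0:
        raise ValueError("lines_per_refresh and refresh_ms must be positive")
    # Assumption: scan-out time is spread evenly over the visible lines,
    # i.e. vertical blanking is ignored.
    ms_per_line = refresh_ms / lines_per_refresh
    runs = []
    # Consecutive scanlines with the same overlay color belong to one frame,
    # because the overlay changes color from each frame to the next.
    for color, group in groupby(scanline_colors):
        lines = sum(1 for _ in group)
        runs.append((color, lines, lines * ms_per_line))
    return runs
```

Each run tells you which frame was on screen, how many scanlines it occupied, and roughly how long it was being scanned out. Out-of-order delivery would show up as colors appearing in an unexpected sequence.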
Interestingly, in this page of Ryan’s Titan review, he reproduces images that suggest a potentially serious problem with AMD’s CrossFire multi-GPU scheme. Presumably due to sync issues between the two GPUs, only tiny slices of some frames, a few pixels tall, are displayed on screen. The value of ever having rendered these frames that aren’t really shown to the user is extremely questionable, yet they show up in benchmark results, inflating FPS averages and the like.
That’s, you know, not good.
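Here’s a quick sketch of why those slivers matter for the numbers. It assumes you already have the on-screen height of each frame in a capture; the function name and the 20-scanline cutoff are hypothetical choices of mine, not anything taken from Ryan’s tools.

```python
def fps_with_and_without_runts(scanlines_per_frame, capture_seconds, min_scanlines=20):
    """Return (raw_fps, visible_fps) for one capture.

    scanlines_per_frame: on-screen height, in scanlines, of every frame seen
    in the capture. min_scanlines is an arbitrary, hypothetical cutoff below
    which a frame is treated as not really shown to the user.
    """
    if capture_seconds <= 0:
        raise ValueError("capture_seconds must be positive")
    total = len(scanlines_per_frame)
    visible = sum(1 for height in scanlines_per_frame if height >= min_scanlines)
    return total / capture_seconds, visible / capture_seconds
```

If every other frame in a one-second capture is only a few pixels tall, the raw figure comes out at twice the rate of frames a user could plausibly perceive. Where to draw the cutoff is a judgment call.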
As Ryan points out, problems of this sort won’t necessarily show up in Fraps frame time data, since Fraps writes its timestamp much earlier in the rendering pipeline. We’ve been cautious about multi-GPU testing with Fraps for this very same reason. The question left lingering out there by Ryan’s revelation is the extent of the frame delivery problems with CrossFire. Further investigation is needed.
I’m very excited by the prospects for tools of this sort, and I expect we’ll be using something similar before long. With that said, I do want to put in a good word for Fraps in this context.
I hesitate to do this, since I don’t want to be known as the "Fraps guy." Fraps is just a tool, and maybe not the best one for the long term. I’m not that wedded to it.
But Ryan has some strongly worded boldface statements in his article about Fraps being "inaccurate in many cases" and not properly reflecting "the real-world gaming experience the user has." His big industry partner has been saying similar things about Fraps not being "entirely accurate" to review site editors behind the scenes for some time now.
True, Fraps doesn’t measure frame delivery to the display. But I really dislike that "inaccurate" wording, because I’ve seen no evidence to suggest that Fraps is inaccurate for what it measures, which is the time when the game engine presents a new frame to the DirectX API.
Taking things a step further, it’s important to note that frame delivery timing isn’t the be-all and end-all that one might think, just because it’s measured at the very end of the pipeline. The truth is, the content of the frames matters just as much to the smoothness of the resulting animation. A constant, evenly spaced stream of frames that is out of sync with the game engine’s simulation timing could depict a confusing, stuttery mess. That’s why solutions like Nvidia’s purported frame metering technology for SLI aren’t necessarily a magic bullet for the trouble with multi-GPU schemes that use alternate frame rendering.
In fact, as Intel’s Andrew Lauritzen has argued, interruptions in game engine simulation timing are the most critical contributor to less-than-smooth animation. Thus, to the extent that Fraps timestamps correspond to the game engine’s internal timing, the Fraps result is just as important as the timing indicated by those colored overlays in the frame captures. The question of how closely Fraps timestamps match up with a game’s internal engine timing is a complex one that apparently will vary depending on the game engine in question. Mark at ABT has demonstrated that Fraps data looks very much like the timing info exposed by several popular game engines, but we probably need to dig into this question further with top-flight game developers.
Peel back this onion another layer or two, and things can become confusing and difficult in a hurry. The game engine has its timing, which determines the content of the frames, and the display has its own independent refresh loop that never changes. Matching up the two necessarily involves some slop. If you force the graphics card to wait for a display refresh before flipping to a new frame, that’s vsync. Partial frames aren’t displayed, so you won’t see tearing, but frame output rates are quantized to the display refresh rate or a whole-number fraction of it. Without vsync, the display refresh constraint doesn’t entirely disappear. Frames still aren’t delivered exactly when ready; fragments of them are, if the screen is being painted at the time.
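The quantization with vsync is easy to see in a toy model. The sketch below is a simplification under my own assumptions: times are integer microseconds, each refresh can show at most one new frame, and the stalls a real GPU would experience while waiting for the flip are ignored. The function name and the 16,667 µs default (roughly a 60Hz display) are just for illustration.

```python
def vsync_flip_times(ready_us, refresh_us=16667):
    """Map the time each frame is ready to the refresh on which it is shown.

    ready_us: ascending, non-negative integer timestamps in microseconds.
    A toy model only: it does not simulate the GPU stalling on a full queue.
    """
    if refresh_us <= 0:
        raise ValueError("refresh_us must be positive")
    flips = []
    for t in ready_us:
        # Round up to the next refresh boundary (integer ceiling division).
        flip = -(-t // refresh_us) * refresh_us
        # Only one new frame can be flipped in per refresh.
        if flips and flip <= flips[-1]:
            flip = flips[-1] + refresh_us
        flips.append(flip)
    return flips
```

Feed it frames that finish at irregular times and the gaps between flips always come out as one, two, or more whole refresh intervals, never anything in between.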
What we should make of this reality isn’t clear.
That’s why I said last time that we’re not likely to have a single, perfect number to summarize smooth gaming performance any time soon. That doesn’t mean we’re not offering much better results than FPS averages have in the past. In fact, I think we’re light years beyond where we were two years ago. But we’ll probably continue to need tools that sample from multiple points in the rendering pipeline, at least unless and until display technology changes. I think Fraps, or something like it, fits into that picture as well as frame capture tools.
I also continue to think that the sheer complexity of the timing issues in real-time graphics rendering and displays means that our choice to focus on high-latency frames as the primary problem was the right one. Doing so orders our priorities nicely, because any problems that don’t involve high-latency frames necessarily involve relatively small amounts of time and are inescapably "filtered" to some extent by the display refresh cycle. There’s no reason to get into the weeds by chasing minor variance between frame times, at least not yet. Real-time graphics has tolerated small amounts of variance from various sources for years while enjoying wild success.
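For anyone who wants to play along at home, here’s one simple way to pull the high-latency frames out of a list of frame times. This is a minimal sketch rather than a description of our actual methods; the function name and the 50 ms default threshold are assumptions chosen for the example.

```python
def high_latency_summary(frame_times_ms, threshold_ms=50.0):
    """Return (count, ms_beyond) for frames slower than threshold_ms.

    count: how many frames took longer than the threshold.
    ms_beyond: total time those frames spent past the threshold.
    The threshold is a hypothetical cutoff; pick whatever suits the test.
    """
    slow = [t for t in frame_times_ms if t > threshold_ms]
    return len(slow), sum(t - threshold_ms for t in slow)
```

Small wobbles around 16 or 17 ms contribute nothing to either number, which is the point: only the long frames register.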