The FCAT tools — continued

Once you have the ability to capture each and every frame of animation streaming out of a video card at will, you're already well down the road to some interesting sorts of analysis. You can play back the sequence exactly as it looked the first time around, slow it down, speed it up, pause, and step through frame by frame. You can even correlate individual frames of animation to spikes recorded in Fraps and things like that. But what if you want to measure the timing of each and every GPU frame coming to the display?

For that purpose, Nvidia has developed an overlay program that inserts a colored bar along the left-hand side of each frame rendered by the GPU. These colors are inserted in a specific sequence of 16 distinct hues and serve as a sort of watermark, so each individual frame can be identified in sequence.

This gets complicated because, remember, with vsync disabled, the "frames" produced by the GPU don't correspond directly to the video "frames" displayed onscreen. In the example above, six GPU frames are spread across four display frames. The GPU is producing frames slightly faster than 60 FPS, or 60 Hz, during this span of time. The GPU frame marked "green" spans two video frames, occupying the bottom half of one and a small slice of the top of the next one, before the GPU switches to a new buffer with the aqua frame. And so on.

We used VirtualDub for the captures. If you simply capture this sort of output with the overlay enabled, you can page through individual video frames to get a clear sense of how frame delivery is happening. Very, very cool stuff. The next bit, though, is kind of magic.

The FCAT extractor tool scans through any input video with the overlay present and produces a CSV file with information about how many scan lines of each color are present in each frame of video. This file contains the raw data needed for all sorts of post-processing, including figuring out which GPU frames span multiple video frames and the like.
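To make that post-processing step concrete, here is a minimal sketch of how per-frame scan-line counts could be turned into GPU frame times. It is not the FCAT extractor or its CSV layout, which this article doesn't spell out. The input structure (a list of color runs per video frame), the function name, the 1080-line frame height, and the 60Hz refresh rate are all assumptions for illustration, and vertical blanking is ignored.

```python
from typing import List, Tuple

# Hypothetical input shape: one (overlay color index, scan-line count) pair
# per colored band, listed top to bottom within each captured video frame.
Run = Tuple[int, int]


def gpu_frame_times_ms(
    video_frames: List[List[Run]],
    lines_per_frame: int = 1080,   # assumed capture height
    refresh_hz: float = 60.0,      # assumed display refresh rate
) -> List[float]:
    """Estimate how long each GPU frame stayed onscreen, in milliseconds."""
    # Time to scan out one line, ignoring vertical blanking (a simplification).
    ms_per_line = 1000.0 / (refresh_hz * lines_per_frame)

    # Walk the bands in display order. A band at the top of a video frame with
    # the same color as the band at the bottom of the previous video frame is
    # the same GPU frame continuing, so merge the two.
    merged: List[List[int]] = []  # each entry is [color, total scan lines]
    for runs in video_frames:
        for color, lines in runs:
            if merged and merged[-1][0] == color:
                merged[-1][1] += lines
            else:
                merged.append([color, lines])

    # The first and last GPU frames are cut off by the start and end of the
    # capture, so their durations are unknown; drop them.
    return [lines * ms_per_line for _, lines in merged[1:-1]]
```

In this sketch, a GPU frame that fills the bottom half of one video frame and the top half of the next adds up to 1080 scan lines, or about 16.7 ms at the assumed settings.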

Interestingly enough, when we first tried the extractor tool with videos captured from a Radeon HD 7970, it didn't work quite properly. We asked Petersen about the problem, and he eventually found that the extractor was having trouble because the overlay colors being displayed by the Radeon weren't entirely correct. The extractor routine had to be adjusted for looser tolerances in order to account for the variance. That variance is mathematically very minor and not easily perceptible, but it is real. To the right is a pixel-doubled and heavily contrast-enhanced section of the (formerly) pink overlay output from the Radeon in Far Cry 3. You can probably see that there is some noise in it. The same pink overlay section from the GeForce GTX 680 is all the same exact color value. Not sure what that's worth or what the cause might be, but it's kind of intriguing.
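The idea of "looser tolerances" can be illustrated with a small sketch. This is not Nvidia's extractor code. The palette values, the `classify` function, and the per-channel distance measure are assumptions made up for the example. The point is only that an exact-match test fails on slightly noisy colors, while a tolerance-based match still identifies them.

```python
from typing import Dict, Optional, Tuple

RGB = Tuple[int, int, int]

# Hypothetical reference colors; these are NOT the actual FCAT overlay values.
PALETTE: Dict[str, RGB] = {
    "green": (0, 128, 0),
    "aqua": (0, 255, 255),
    "pink": (255, 0, 255),
}


def classify(pixel: RGB, tolerance: int) -> Optional[str]:
    """Return the nearest palette color name, or None if nothing is close enough.

    Distance is the largest per-channel difference, so tolerance=0 demands an
    exact match and larger values accept progressively noisier pixels.
    """
    best_name: Optional[str] = None
    best_dist: Optional[int] = None
    for name, ref in PALETTE.items():
        dist = max(abs(p - r) for p, r in zip(pixel, ref))
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist is not None and best_dist <= tolerance:
        return best_name
    return None
```

With these made-up values, a slightly noisy pink such as `(252, 3, 254)` is rejected at a tolerance of 0 but recognized at a tolerance of 4.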

After the overlay info has been extracted, the next step is to process it in various ways. Petersen has created a series of Perl scripts that handle that job. They can spit out all sorts of output, including a simple set of successive frame times that we can use just like Fraps data. The FCAT scripts include lots of options for processing and filtering the data, and since they're written in Perl, they can be modified easily. One thing they'll do is use Gnuplot to graph results. In fact, by default, the FCAT scripts produce two graphs that will look fairly familiar to TR readers: a frame time plot and a percentile curve.
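For readers who haven't built a percentile curve before, the calculation can be sketched in a few lines. This is a Python illustration rather than the FCAT Perl code, and the nearest-rank method used here is an assumption; the scripts may well interpolate differently.

```python
import math
from typing import List


def percentile_nearest_rank(frame_times_ms: List[float], pct: float) -> float:
    """Return the frame time at or below which `pct` percent of frames fall.

    Uses the nearest-rank definition: sort the samples and take the value at
    position ceil(pct / 100 * n), counting from 1.
    """
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(frame_times_ms)
    # Multiply before dividing so whole-number inputs stay exact.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def percentile_curve(frame_times_ms: List[float], pcts: List[float]) -> List[float]:
    """Evaluate several percentiles at once, e.g. to plot a curve."""
    return [percentile_nearest_rank(frame_times_ms, p) for p in pcts]
```

Feeding it a list of successive frame times and a range of percentiles (say 50, 90, 95, and 99) gives the points for a curve of the kind described above.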

Pardon the extreme compression, but the default plot size is ginormous, and I've not yet sorted out how to modify it. One nice thing the FCAT frame time plot does is correlate each frame time distribution to the scene time, something you won't see in our current Excel plots.

The percentile curves will look inverted if you're used to ours, because FCAT converts them into FPS terms. I know that option will be popular with some folks who still find the concept of FPS more intuitive to understand.
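The inversion follows from the arithmetic: a frame time in milliseconds converts to an instantaneous FPS figure as 1000 divided by the frame time, so longer frame times map to lower FPS values. A minimal sketch, with a function name of our own choosing rather than anything from the FCAT scripts:

```python
from typing import List


def frame_time_ms_to_fps(frame_time_ms: float) -> float:
    """Convert a single frame time in milliseconds to its FPS equivalent."""
    if frame_time_ms <= 0:
        raise ValueError("frame time must be positive")
    return 1000.0 / frame_time_ms


def curve_to_fps(frame_times_ms: List[float]) -> List[float]:
    """Convert a whole percentile curve from frame times to FPS terms."""
    return [frame_time_ms_to_fps(t) for t in frame_times_ms]
```

A curve that rises from 16 ms to 50 ms in frame-time terms falls from 62.5 FPS to 20 FPS once converted, which is why the two plots look like mirror images.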

We haven't yet converted to using FCAT's visualization tools in place of our usual Excel sheets, but there is potential for automation here that extends well beyond what we currently have in place. If these tools are to be widely used in the industry—or, heck, even consistently used in several places—then automation of this sort will no doubt be needed. Processing this type of data isn't trivial; it's a long way from throwing together a few FPS averages.

Speaking of which, I should say that my summary of FCAT capture and analysis boils down a much more complex process. Configuring everything to work properly is a tedious affair that involves synchronizing EDIDs for the display and capture card behind the splitter, doing just the right magic to ensure good video captures without dropped or inserted frames, and a whole host of other things.

With that said, it's still extremely cool that Nvidia is enabling this sort of analysis of its products. The firm says its FCAT tools will be freely distributable and modifiable, and at least the Perl script portions will necessarily ship as readable source (since Perl is an interpreted language). Nvidia says it hopes portions of the FCAT suite, such as the colored overlay, will be incorporated into third-party applications. We'd like to see Fraps incorporate the overlay, since using Fraps alongside the FCAT overlay is sometimes problematic.

Now, let's see what we can learn by making use of these tools.