Is FCAT more accurate than Fraps for frame time measurements?

Here's a geeky question we got in response to one of our discussions in the latest episode of the podcast that deserves a solid answer. It has to do with our Inside the Second methods for measuring video game performance using frame times, as demonstrated in our Radeon R9 Fury review. Specifically, it refers to the software tool Fraps versus the FCAT tools that analyze video output.

TR reader TheRealSintel asks:

On the FRAPS/frame-time discussion: I remember from the whole FCAT introduction that FRAPS was not ideal, and I also heard that some vendors' performance can take a dive when FRAPS is enabled, etc.



I actually assumed the frametimes in each review were captured using FCAT instead of FRAPS.



When you guys introduce a new game to test, do you ever measure the difference between in-game reporting, FCAT and FRAPS?

I answered him in the comments, but I figure this answer is worth promoting to a blog entry. Here's my response:

There's a pretty widespread assumption at other sites that FCAT data is "better" since it comes from later in the frame production process, and some folks like to say Fraps is less "accurate" as a result. I dispute those notions. Fraps and FCAT are both accurate for what they measure; they just measure different points in the frame production process.

It's quite possible that Fraps data is a better indication of animation smoothness than FCAT data. For instance, a smooth line in an FCAT frame time distribution wouldn't lead to smooth animation if the game engine's internal simulation timing doesn't match well with how frames are being delivered to the display. The simulation's timing determines the *content* of the frames being produced, and you must match the sim timing to the display timing to produce optimally fluid animation. Even "perfect" delivery of the frames to the display will look awful if the visual information in those frames is out of sync.
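To make the point concrete, here's a toy numerical sketch (the numbers are invented, not measured data): display flips can be perfectly even while the simulation time baked into each frame advances unevenly, and it's the latter that the eye perceives as motion.

```python
# Toy numbers only: six frames flipped to the display at a perfectly even cadence,
# but with simulation timestamps (the in-game moment each frame depicts) that
# advance unevenly.
display_times = [0, 17, 34, 51, 68, 85]   # ms, when each frame hits the screen
sim_times     = [0, 5, 38, 43, 76, 81]    # ms, the in-game time each frame shows

display_steps = [b - a for a, b in zip(display_times, display_times[1:])]
content_steps = [b - a for a, b in zip(sim_times, sim_times[1:])]

print(display_steps)  # [17, 17, 17, 17, 17] -> the delivery side looks perfectly smooth
print(content_steps)  # [5, 33, 5, 33, 5]    -> but the world lurches between frames
```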

What we do now for single-GPU reviews is use Fraps data (or in-engine data for a few games) and filter the Fraps results with a three-frame moving average. This filter accounts for the effects of the three-frame submission queue in Direct3D, which can allow games to tolerate some amount of "slop" in frame submission timing. With this filter applied, any big spikes you see in the frame time distribution are likely to carry through to the display and show up in FCAT data. In fact, this filtered Fraps data generally looks almost identical to FCAT results for single-GPU configs. I'm confident it's as good as FCAT data for single-GPU testing.
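Here's a minimal sketch of that kind of three-frame moving average, for illustration only; it isn't TR's actual processing script, and it assumes a trailing window (the current frame plus the two before it).

```python
# A minimal sketch of a three-frame moving average over frame times, in the spirit
# of the filter described above.
def three_frame_average(frame_times_ms):
    smoothed = []
    for i in range(len(frame_times_ms)):
        window = frame_times_ms[max(0, i - 2):i + 1]   # up to the 3 most recent frames
        smoothed.append(sum(window) / len(window))
    return smoothed

# A lone 40 ms submission amid 16 ms frames is mostly absorbed, mirroring what the
# submission queue can hide; a sustained run of slow frames would still stand out.
print(three_frame_average([16, 16, 40, 16, 16, 16]))   # [16.0, 16.0, 24.0, 24.0, 24.0, 16.0]
```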

For multi-GPU configs, things become more complicated because frame metering/pacing comes into the picture. In that case, Fraps and FCAT may look rather different. That said, a smooth FCAT line alone is not a guarantee of smooth animation with multi-GPU. Frame metering only works well when the game advances its simulation time using a moving average or a fixed cadence. If the game just uses the wall clock for the current frame, then metering can be a detriment. And from what I gather, game engines vary on this point.
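As a rough illustration of why the engine's timing strategy matters under frame metering, here's a toy sketch contrasting wall-clock stepping with moving-average stepping; the AFR submission pattern and all numbers are invented.

```python
# Toy sketch of the two timing strategies above under AFR-style delivery. The raw
# submission intervals alternate fast/slow; assume the driver meters flips out to
# an even ~19 ms cadence.
raw_intervals = [10, 28, 10, 28, 10, 28]                 # ms between frame submissions

# Strategy A: advance the simulation by the last wall-clock delta.
wall_clock_steps = list(raw_intervals)                   # content jumps 10, 28, 10, 28...

# Strategy B: advance by a moving average of recent deltas (or a fixed cadence).
avg = sum(raw_intervals) / len(raw_intervals)            # 19.0 ms
metered_steps = [avg] * len(raw_intervals)               # content advances evenly

# With metered ~19 ms flips, strategy B matches the display and animates smoothly;
# strategy A still lurches even though the delivery-side line would look flat.
print(wall_clock_steps)
print(metered_steps)
```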

(Heck, the best behavior for game engine timing for SLI and CrossFire—advancing the timing using a moving average or fixed cadence—is probably the opposite of what you'd want to do for a variable-refresh display with G-Sync or FreeSync.)

That's why we've been generally wary of AFR-based multi-GPU and why we've provided video captures for some mGPU reviews. See here.

At the end of the day, a strong correlation between Fraps and FCAT data would be a better indication of smooth in-game animation than either indicator alone, but capturing that data and quantitatively correlating it is a pain in the rear and a lot of work. No one seems to be doing that (yet?!).

Even further at the end of the day, all of the slop in the pipeline between the game's simulation and the final display is less of a big deal than you might think so long as the frame times are generally low. That's why we concentrate on frame times above all, and I'm happy to sample at the point in the process that Fraps does in order to measure frame-to-frame intervals.

I should also mention: I don't believe the presence of the Fraps overlay presents any more of a performance problem than the presence of the FCAT overlay when running a game. The two things work pretty much the same way, and years of experience with Fraps tells me its performance impact is minimal.

Here's hoping that answer helps. This is tricky stuff. There are also the very practical challenges involved in FCAT use, like the inability to handle single-tile 4K properly and the huge amount of data generated, that make it more trouble than it's worth for single-GPU testing. I think both tools have their place, as does the in-engine frame time info we get from games like BF4.

In fact, the ideal combination of game testing tools would be: 1) in-engine frame time recordings that reflect the game's simulation time combined with 2) a software API from the GPU makers that reflects the flip time for frames at the display. (The API would eliminate the need for fussy video capture hardware.) I might add: 3) a per-frame identification key that would let us track when the frames produced in the game engine are actually hitting the display, so we can correlate directly.
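To show what that per-frame correlation might look like, here's a purely hypothetical sketch; no such flip-time API or frame ID key exists today, so the record formats and numbers below are invented.

```python
# Hypothetical sketch of the correlation item 3 describes; frame IDs, timestamps,
# and record formats are all made up for illustration.
sim_times  = {101: 0.0, 102: 16.2, 103: 33.0, 104: 48.9}   # game engine: in-game time per frame ID (ms)
flip_times = {101: 21.0, 102: 37.5, 103: 54.1, 104: 70.2}  # display side: when each frame ID actually flipped (ms)

# Per-frame delay from simulation to display, matched by ID:
latency = {fid: flip_times[fid] - sim_times[fid] for fid in sim_times if fid in flip_times}

# Frame-to-frame pacing at both ends of the pipeline:
ids = sorted(latency)
sim_deltas  = [round(sim_times[b] - sim_times[a], 1) for a, b in zip(ids, ids[1:])]
flip_deltas = [round(flip_times[b] - flip_times[a], 1) for a, b in zip(ids, ids[1:])]

print(latency)                    # how much the pipeline delays each frame
print(sim_deltas, flip_deltas)    # close agreement here is the real "smooth animation" signal
```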

For what it's worth, I have asked the GPU makers for the API mentioned in item 2, but they'd have to agree on something in common in order for that idea to work. So far, nobody has made it a priority.

Comments closed
    • itachi
    • 4 years ago

    Cool. I only ended up reading the first two lines because I'm lazy, but there's a subject I'd be more interested in: input lag. Linustechtips did a video about G-Sync vs. FreeSync input lag, and the conclusion was rather confusing; some people claim the results are flawed. I also had to dig around and ask about the whole "you can have V-sync on or off with G-Sync" thing, since I didn't know whether you could or should use it. (People say you need to cap the FPS at something like 140, but now I can't remember why. Is it to keep the GPU from rendering more frames than the 144Hz refresh can show?)

    I plan to buy Asus' upcoming screen, the 1440p IPS 144Hz one, which is why I'm curious.

    • sweatshopking
    • 4 years ago

    I DON’T KNOW ANYTHING ABOUT ANY OF THIS, BUT I KNOW ONE THING FOR SURE.
    EVERY SINGLE ONE OF YOU IS WRONG.

      • southrncomfortjm
      • 4 years ago

      I think you are right. Does that also make me wrong? And if I’m wrong, then wouldn’t you also be wrong, in which case I would be right?

    • Mr Bill
    • 4 years ago

    Somebody has to ask the dumb questions so here goes…
    Can the motherboard chipset / PCIe bus interaction with the video card be safely ignored in all this mix? I mean, there is no er, ATI optimization for AMD chipsets versus other chipsets, right?

    • Freon
    • 4 years ago

    Thanks for the article. It’s nice to have one place to link when it comes up in discussions about the subject rather than all the smaller points along the way made in the “inside the second” series.

    • marraco
    • 4 years ago

    There are important aspects of games that are not considered in reviews.

    For example, Far Cry 4, even when running at 80 FPS, feels like 20 or 30 fps. There is something wrong with that engine.

    Maybe it is the poor handling of mouse rotation: it moves in jerks.
    When the user rotates the camera, it does not move continuously but in discrete jumps. The jumps are too large, and they produce the same effect as a lower frame rate, even when FCAT and Fraps report higher than 60 FPS.

    That’s a common problem in PC ports, but even native PC games suffer from poor mouse implementations.

      • Jigar
      • 4 years ago

      I had this issue; I switched to an SSD and the problem vanished.

        • marraco
        • 4 years ago

        I have an SSD. This has nothing to do with the SSD.

          • Ninjitsu
          • 4 years ago

          The problem you describe sounds like a texture loading issue, which may be related to SSDs and RAM, as in the case of Arkham Knight.

            • DoomGuy64
            • 4 years ago

            That would be true if the game dynamically loaded textures, but since most games run fine on HDDs, that shouldn't be the case. It is more likely a RAM/multitasking issue, which might be fixed by eliminating any background tasks with high CPU or RAM usage.

            • marraco
            • 4 years ago

            No. I had that problem while standing in the same place, with no texture loading at all.

            Just rotating the view and moving at low speed triggered it, even inside buildings, although it is worst in open areas with lots of trees.

            And my specs were:
            Samsung 250 GB EVO SSD
            16 GB RAM @ 1600 (and the game never uses more than 4 GB)
            GTX 970 4 GB

            I even had the problem with everything set to LOW graphics quality at 1680×1050 @ 120Hz.

            It is a flaw in the engine, which, by the way, also suffers from a poor FOV implementation even after many patches: the FOV frequently resets to a low value after you enter a tower, grab a ladder or a ledge, or walk over certain areas. To add insult to injury, ignorant people claim you pirated the game, because pirated copies have another, different FOV problem.

            • Westbrook348
            • 4 years ago

            I had microstuttering with an i7-2600 and 7950 playing GTA4, which is an infamously flawed port. In a game in which you drive down the street, the evenly spaced light posts should not pass you at variable times. I definitely think it was a problem with the game engine (or drivers) not my hardware.

            • kuttan
            • 4 years ago

            It's because of an unoptimized port.

            • Ninjitsu
            • 4 years ago

            Hmmm. Interesting. I played that game on a friend’s PC, he has a 3570K, a 2GB GTX 670, 8GB RAM (@1600), and runs it off a 7200 RPM HDD. I don’t remember issues like that (at 1080p, with high-ultra settings, tweaked as necessary).

            I don't know, I don't want to attempt a diagnosis without more data. But yeah, grabbing ledges and climbing ladders seemed broken compared to FC3 (which I own and have played)... but I think he *was* running a pirated version, so maybe that has something to do with it.

            EDIT: You didn't mention your CPU?

            • kuttan
            • 4 years ago

            > The problem you describe sounds like a texture loading issue, which may be related to SSDs and RAM, as in the case of Arkham Knight.

            He is not talking about a texture loading issue. A texture loading issue would result in momentary stuttering, which is not what he describes. In FC4 you need a relatively high frame rate to get smooth-feeling graphics, which I felt as well. The game engine just behaves that way.

      • Mr Bill
      • 4 years ago

      “…but even native PC games suffer from poor mouse implementations.”

      I never learned to move comfortably in WOW using the keyboard, so tanking in WOW with a mouse was a bit of a pain for me. Often when changing directions, forward to back for example, movement would hesitate for too long.

      • TwoEars
      • 4 years ago

      By contrast, the Tomb Raider engine is a masterpiece: buttery smooth, and it even scales well with SLI.

      • auxy
      • 4 years ago

      I will confirm that Far Cry 4 looks like garbage in terms of motion smoothness. Frankly, I'm tempted to do some testing myself, but I just don't really care, because I don't like the game or the series or the company! (*´ω`*)

      The Evolution Engine used in Warframe looks really nice and smooth even at lower frame rates. It’s also not very demanding given how nice it looks. It makes a great test case to show off things like LightBoost 2D and G-Sync.

      • Melvar
      • 4 years ago

      Turn the mouse sensitivity down in the game and use a higher DPI mouse. If you double the DPI and halve the sensitivity you should get twice as many discrete jumps when turning the same amount.
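      For what it's worth, a quick sketch of the arithmetic behind that suggestion (the DPI, sensitivity, and movement values are made up):

      ```python
      # A hand movement of `inches` produces dpi * inches counts, and the game turns
      # the camera by `sens_deg_per_count` degrees for each count it receives.
      def turn(dpi, sens_deg_per_count, inches):
          counts = dpi * inches                       # discrete steps the game sees
          return counts, counts * sens_deg_per_count  # (granularity, total degrees turned)

      print(turn(800, 0.05, 0.125))    # (100.0, 5.0)  baseline
      print(turn(1600, 0.025, 0.125))  # (200.0, 5.0)  same total turn, twice as many steps
      ```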

    • jihadjoe
    • 4 years ago

    That graph is interesting because FRAPS shows quite a bit of variance, while the FCAT line is pretty smooth.

      • Ninjitsu
      • 4 years ago

      I wonder if FRAPS variance can be seen as an indicator of driver/engine/CPU issues, and FCAT variance as merely GPU/driver issues?

        • jihadjoe
        • 4 years ago

        Not sure. I was sort of expecting the FCAT line to be even worse, but the fact that it’s better means there’s some smoothing that happens after the point where FRAPS captures its data.

      • Damage
      • 4 years ago

      That is a pre-frame-pacing graph. Things got better with frame pacing added to AMD’s drivers:

      https://techreport.com/r.x/radeon-hd-7990/crysis3-7990p.gif

    • memorylane
    • 4 years ago

    > It's quite possible that Fraps data is a better indication of animation smoothness than FCAT data. For instance, a smooth line in an FCAT frame time distribution wouldn't lead to smooth animation if the game engine's internal simulation timing doesn't match well with how frames are being delivered to the display. The simulation's timing determines the *content* of the frames being produced, and you must match the sim timing to the display timing to produce optimally fluid animation. Even "perfect" delivery of the frames to the display will look awful if the visual information in those frames is out of sync.

    Sorry, I don't quite get you there, Scott. Matching sim speed to display speed is a matter of V-sync or a variable-refresh display (within its range), but matching the two doesn't have anything to do with what FCAT and Fraps are trying to find, since when benchmarking we inherently don't care whether they match. We're only interested in the maximum output and how each frame is delivered. In that regard, FCAT is always more accurate, because it shows how each frame is actually displayed on the monitor, and that's what matters in the end. At best Fraps can replicate what FCAT finds for a single GPU, but I don't see any situation in which Fraps is a better indication of smoothness.

      • MathMan
      • 4 years ago

      Imagine a game engine that renders frames at a highly random rate: one takes 5ms, the next 100ms, then 5ms again.
      Imagine that the game engine uses these times in its internal simulation.
      In that case, FRAPS will show those times: 5ms, 100ms, ...

      Now imagine that the driver smooths out the frame delivery such that all frames are spaced 52.5ms apart.

      In that case, FCAT will be a flat line, but the result will look terrible, since the simulation and the frame display are not in sync.

      What you need in order to be sure all is well is for FCAT and FRAPS to correlate. One indicator is not always enough.

    • DPete27
    • 4 years ago

    One thing I'm wondering: with the advent of variable-refresh monitors, how has the importance of even frame-time delivery to the perception of smooth animation changed, if at all?

      • eofpi
      • 4 years ago

      It’s stopped the card from waiting for the next display update time to send it a frame. Instead, up to the limits of the display’s capabilities, it can send the next frame whenever it’s ready. That’s the good news.

      The bad news is, how the game engine handles its internal updates will depend on the engine, and I don’t know enough about game engines to say how much work it’ll take to make that play nicely with variable refresh tech.

    • TheRealSintel
    • 4 years ago

    Thanks for the recap, much more thorough than I bargained for. You see what these podcasts are good for 😉 It actually made me remember some of the simulation-time vs display-time discussions from 2 years ago.

    As you summarized, ideally we would measure at both points (FRAPS+FCAT) and correlate the frames, but that’s not feasible today. EDIT: this is actually assuming that FRAPS captures simulation-time, but even that is not a given from your answer. Whether or not output metering is beneficial thus depends on the game engine :s

    Given it’s been that long already, you’d hope that the benchmarking ‘state of the art’ would be a bit further along… Guess there’s still some more wassonizing to do.

    EDIT:
    Actually, given all this, *if* the display-intervals closely follow the simulation-intervals, how big of an issue is some variance *really*?

    I agree this can be a huge issue with traditional displays, but since the advent of VFR technologies (and only in that case), much of the stuttering/tearing caused by the display chain itself is removed. The question then remains, how sensitive is the human vision to this variance? What is a minimally acceptable variance for smooth animation? Intuitively, it will probably also relate to the speed of the object being animated.

      • homerdog
      • 4 years ago

      You're definitely onto something, but it simply isn't feasible for Scott et al. to do this kind of analysis for every GPU review. It might make for a feature article or something, though.

      • Andrew Lauritzen
      • 4 years ago

      You can definitely make it better, but you can never make it “perfect”. All of the VFR solutions still effectively do some sort of guessing at the frame rate based on past data. To “perfectly” update the simulation you’d have to know at what time the frame *is going to be displayed in the future*, which is of course impossible.

      Thus there is always a need to wipe out variance here. In VR it’s such a crucial need that you simply must go to a fixed frame rate (vsync). VFR type solutions are pretty reasonable outside of VR, but they are not an excuse to punt on this stuff entirely.

        • TheRealSintel
        • 4 years ago

        Are you sure about that guess-work in VFR solutions? I thought that was only the case for frames *outside* the VFR window, and inside the window things progressed deterministically. At least that’s what I got out of pcper’s analysis at the time.

        I’m not saying variance is to be dismissed, only that in a VFR solution some variance will probably be much better tolerated than in a traditional fixed refresh solution.

        After digging up some old articles related to UI animations that I read in the past (especially section 2 of https://weblogs.java.net/blog/chet/archive/2006/02/make_your_anima.html), it seems it's impossible to quantify how much variance (or what minimum frame time) is tolerable without relating it to what is being shown/animated on the screen.

          • Andrew Lauritzen
          • 4 years ago

          > Are you sure about that guess-work in VFR solutions? I thought that was only the case for frames *outside* the VFR window, and inside the window things progressed deterministically.

          Nah, it’s physically impossible to know when a frame is going to be presented in the future while generating the content for it. As long as we’re stuck with the current causal structure of space-time, you need to predict a bit 🙂

          > I’m not saying variance is to be dismissed, only that in a VFR solution some variance will probably be much better tolerated than in a traditional fixed refresh solution.

          Agreed, I’m just noting that while it better tolerates systematic variance (i.e. whole thing got a bit slower/faster), it doesn’t really do anything about spikes. We still need to measure, report and eliminate those.

    • DPete27
    • 4 years ago

    I thought this was discussed back when FCAT came out?
    Yeah, here we go: https://techreport.com/review/24553/inside-the-second-with-nvidia-frame-capture-tools
    [Edit] Nvm, that article was linked in this one...

    • YukaKun
    • 4 years ago

    Is there a metric for how much the CPU time spent generating these numbers affects the measurements themselves? I'm not talking about guesstimates, but proper measurements.

    I say this because science loves measuring with as little interference as possible. I think there is a case for measuring the FPS outside of the computer as a whole, directly from the DP, VGA, HDMI, or DVI port if it can be done. At least then, whatever error margin we get lies outside of the measurement itself.

    Cheers!

    • eofpi
    • 4 years ago

    TR uses a rolling 3-frame average with D3D games to account for the D3D submission queue. The Mantle games report direct frame times, and that data is reported without further processing. I presume Mantle has no such submission queue.

    What does TR do with OpenGL games? Are any in the test rotation? Is OpenGL compatible with Fraps or FCAT?

      • Damage
      • 4 years ago

      Haven’t tested an OpenGL game for a long time. Rage was a candidate, but it was fixed at 60Hz with dynamic image quality adjustments, so it wasn’t easy to test.

      We don’t use the three-frame low-pass filter on Mantle games or on any data produced by a game itself. There might be a case for doing so in certain cases, but it would depend on what exactly the timestamp from the game engine actually measures.

        • Andrew Lauritzen
        • 4 years ago

        While I have no problem with using a small moving average, it’s slightly misleading to relate it precisely to the “submission queue” of the OS, or more accurately the swap chain. Obviously any game that was doing naïve timing would not see filtered results or anything – as Scott mentions in his reply.

        In terms of swap chains, those are part of the OS/windowing system so generally everything (Mantle, OpenGL, etc) all goes through the same path. There are some minor details and so on when we get to things like DirectX 12 and Windows 10, but to get into the nitty gritty of that stuff would require a much more careful examination of how *specific* games treat the swap chain, as there are a variety of modes and behaviors available even without throwing different OS/APIs into the mix.

        So yeah, I’d say Scott is being slightly charitable by using the moving average, but it’s not completely unreasonable and it seems to track well with qualitative impressions of game smoothness. I certainly wouldn’t want the filter to be any wider (>3 frames), but 3 is not going to drastically change the results too much, and it clears up the plots a bit. Definitely shouldn’t do any processing/filtering on data coming from game engines as that is what they themselves are using to generate the frames of course.

    • TwoEars
    • 4 years ago

    Tricky stuff indeed.

    We’re getting into the nitty gritty of it now.

    Option 3 is definitely doable, but you'd need to write some software similar to Fraps, and you'd also need a way to capture the video (or at least register a change in a number or color). It can be done, but it would take some money and effort.

    Another option would be for Nvidia to build this ability into, say, the G-Sync chip so you could pull the data from there. Who knows, maybe the data is already in there but hidden.

    Maybe it would be possible to modify or hack a scaler chip to register this, and then get the data from there?

      • Mr Bill
      • 4 years ago

      It might be interesting to do an article / tour of the lab/workshop where NVIDIA or AMD measures their own progress when optimizing their chips for release.

    • chuckula
    • 4 years ago

    Hi Damage, great explanation and props to TheRealSintel for the question.

    I have one question about this part of your answer:
    > For instance, a smooth line in an FCAT frame time distribution wouldn't lead to smooth animation if the game engine's internal simulation timing doesn't match well with how frames are being delivered to the display. The simulation's timing determines the *content* of the frames being produced, and you must match the sim timing to the display timing to produce optimally fluid animation.

    I fully agree that this is an important factor when actually playing a game for enjoyment, and I'm not saying your reviews should ignore it. However, when we change the context from enjoyment of the game to analysis of the GPU hardware itself, how much of a role does the GPU hardware/driver play in actually making sure the simulation's timing is accurate? Is that heavily GPU dependent, or is it more CPU dependent? That's not to say it's 100% GPU or CPU, but does one component dominate at that particular level?

      • Damage
      • 4 years ago

      The correspondence between the simulation timing and the display timing depends on everything that happens after the game feeds a bunch of frame commands to the GPU and before the frame is displayed. Direct3D and the GPU drivers do the work in between, and there's also buffering and overlap between frames involved. As I understand it, this work can be both CPU- and GPU-bound at multiple stages in the process.

      The issue isn’t so much “Is the game’s timing accurate?” per se as it is: “Does the game engine’s timing match up with when the frame is being displayed?”

      The question of game timing vs. display timing is just one aspect of performance. It is the aspect that prevents us from saying any single number from Fraps or FCAT is "the right one," but it is a smaller factor in the grand scheme than the problems caused by really long delays in frame production. Those delays can happen even when game -> display timing corresponds very closely, and they are the main culprit behind less-than-smooth animation.

      Edit: Also, maybe I’m messing a bunch of this stuff up. I am not a game developer, although I’ve talked to some about this stuff.

        • chuckula
        • 4 years ago

        Thanks!

      • Andrew Lauritzen
      • 4 years ago

      Spikes in *CPU* work in the driver are the biggest thing you need to watch out for here. If the driver suddenly decides to take ~30ms to move around some memory or something then you can get a really long frame time and many game engines will naively advance the simulation proportionally. Even if the pipeline absorbs the bubble adequately (and thus FCAT sees no issues), the damage has already been done.

      New APIs should help a fair bit with this stuff both by moving things off a single rate-limiting thread (and thus being super-sensitive to spikes on that thread), and by just generally lowering the amount of unpredictable driver behavior in general. More of the spikes are likely to come from the games themselves going forward which is a good thing as game developers can measure and reason about that a lot better. It’s not a magic bullet, but it’s one of the main immediate benefits of the new APIs, even if you’re not going to render a zillion tiny objects or something.

      (There are of course a few other places where things can go wrong including the OS scheduler/kernel itself, but those are sort of beyond the scope of this discussion.)

        • chuckula
        • 4 years ago

        Awesome! Thanks for the inside info!

    • BobbinThreadbare
    • 4 years ago

    Interesting stuff. Don’t really have anything to add other than good work.

      • ImSpartacus
      • 4 years ago

      Yeah, this is really the only major area where TR differentiates, so it's good to see them still hitting it hard.
