So now what?
This first take on Nvidia's FCAT tools and the things they can measure is just a beginning, so I don't have many conclusions for you just yet. But this is an awfully good start to a new era of GPU benchmarking. We can now see into the early stages of the rendering pipeline with Fraps and then determine exactly what's happening at the other end of the pipe when visuals are delivered to the display with FCAT. We can correlate the two and see how much leeway there is between them. And we also have videos that allow us to review the resulting animation with frame-by-frame precision, to show us the exact impact of any spikes or anomalies in the numbers we're seeing. These are the best tools yet for understanding real-time graphics performance, and they offer the potential for lots of new insights.
The fact that Nvidia has decided to release analytical tools of this caliber to the general public is remarkable. Yes, the first results of those tools have detected some issues with its competition's products, but who knows what other problems we might uncover with them down the road? Nvidia is taking a risk here, and the fact it's willing to do so is incredibly cool.
Going forward, there's still tons of work to be done. For starters, we need to spend quite a bit more time understanding the problems of multi-GPU micro-stuttering, runt frames, and the like. The presence of these things in our benchmark results may not be all that noteworthy if overall performance is high enough. The stakes are pretty low when the GPUs are constantly slinging out new frames in 20 milliseconds or less. I've not been able to perceive a problem with micro-stuttering in cases like that, and I suspect those who claim to are seeing extreme cases or perhaps other issues entirely. Our next order of business will be putting multi-GPU teams under more stress to see how micro-stuttering affects truly low-frame-rate situations where animation smoothness is threatened. We have a start on this task, but we need to collect lots more data before we are ready to draw any conclusions. Stay tuned for more on that front. I'm curious to see what other folks who have these tools in their hands have discovered, too.
The FCAT analysis has shown us that Nvidia's frame metering tech for SLI does seem to work as advertised. Frame metering isn't necessarily a perfect solution, because it does insert some tiny delays into the rendering-and-display pipeline. Those delays may create timing discontinuities between the game simulation time—and thus frame content—and the display time. They also add a minuscule bit to the lag between user input and visual response. But then there's apparently a fair amount of low-stakes timing slop in PC graphics, as the gap between our Fraps and FCAT results (in everything but the Unreal-engine-based Borderlands 2) has demonstrated. The best thing we can say for frame metering is that it makes the Fraps and FCAT times for SLI solutions appear to correlate about like they do for single-GPU solutions. That's a really high-concept way of saying that it appears to work pretty well.
We do want to be careful to note that frame delivery as measured by FCAT is just one part of a larger picture. Truly fluid animation requires the regular delivery of frames whose contents are advancing at the same rate. What happens at the beginning of the pipeline needs to match what happens at the end. Relying on FCAT numbers alone will not tell that whole story; we'd just be measuring the effectiveness of frame metering techniques. We've come too far in the past couple of years in how we measure gaming performance to commit that error now.
Ideally, we'd like to see two things happen next. First, although FCAT's captures are nice to have and Nvidia's scripts provide a measure of automation, using these tools is a lot of work and generates huge amounts of data. It would be very helpful to have an API from the major GPU makers that exposes the true timing of the frame-buffer flips that happen at the display. I don't think we have anything like that now, or at least nothing that yields results as accurate as those produced by FCAT. With such an API, we could collect end-of-pipeline data much easier and use frame captures sparingly, for sanity checks and deeper analysis of images. Second, in a perfect world, game developers would expose an API that reveals the internal simulation timing of the game engine for each frame of animation. That would allow us to do away with grabbing the Present() time via Fraps and end any debate about the accuracy of those numbers. We'd then have the data we need to correlate with precision the beginning and ending of the pipeline and to analyze smoothness—or, well, for someone who's smarter than us about the tricky math of a rate-match problem and the perceptual thresholds for smooth animation to do so.Follow me on Twitter for shorter ramblings.
189 comments — Last by uartin at 4:00 AM on 04/05/13
|1. BIF - $340||2. Ryu Connor - $250||3. mbutrovich - $250|
|4. YetAnotherGeek2 - $200||5. End User - $150||6. Captain Ned - $100|
|7. Anonymous Gerbil - $100||8. Bill Door - $100||9. ericfulmer - $100|
|10. dkanter - $100|
|Nvidia's GeForce GTX 1050 and GTX 1050 Ti graphics cards unveiledThe everyman's Pascal cards arrive||49|
|Examining early DirectX 12 performance in Deus Ex: Mankind DividedWe take a preview build for a spin||85|
|Asus' ROG Strix GeForce GTX 1080 graphics card reviewed Pascal gets a chance to soar||69|
|AMD's Radeon RX 460 graphics card reviewedThe pint-sized Polaris breaks cover||107|
|AMD's Radeon RX 470 graphics card reviewedA lesser star that's just as bright||289|
|AMD reveals the full specs of the Radeon RX 460 and RX 470But pricing remains a mystery||88|
|Gigabyte's GeForce GTX 1080 Xtreme Gaming graphics card reviewedPower Xtreme||56|
|Nvidia's GeForce GTX 1070 graphics card reviewedPascal power at a nicer price||106|
|AMD drops prices on the Radeon RX 460 and RX 470||43|
|Reports: Radeon RX 470D is a budget Polaris card for China||9|
|Examining reports of slow write speeds on the 32GB iPhone 7||30|
|Cellular Insights dissects iPhone 7 Plus modem performance||11|
|Deals of the week: scads of high-performance storage and more||9|
|Tobii's Eye Tracker 4C knows where your head is||5|
|GeForce driver 375.57 is prepared for Titanfall 2||8|
|Phanteks Eclipse P400 gets a tempered glass option||0|
|Radeon 16.10.2 drivers add support for October's big games||10|
|A real "console monitor" would be 720p @ 30 Hz ;P||+63|