NVIDIA sets optimization auditing rules

In response to a string of questionable optimizations in its Detonator FX 44.03 drivers, NVIDIA has set three key criteria for evaluating new driver optimizations and pledged to audit candidate optimizations before they're approved for inclusion in future driver releases. The criteria, as defined by NVIDIA in a conference call this morning, are:
  • An optimization must produce the "correct" image.
  • An optimization must accelerate more than just a benchmark case.
  • An optimization cannot contain a pre-computed state (static clip planes, pre-computed geometry, etc.).
If you've been following the recent optimization/cheating controversy, you'll notice that a number of the Det FX 44.03's optimizations fail one or more of the above tests. Current optimizations that don't meet the above criteria will be removed in the next driver release, and future optimizations will be audited to ensure that they adhere to the new criteria.

As you can see, NVIDIA's new optimization auditing guidelines don't exclude application-specific optimizations or pixel shader instruction re-ordering, as long as those optimizations don't break any of the three rules. We'll have to wait until NVIDIA's next driver release to see how the new guidelines impact performance and image quality.

The principles look promising, but the devil will be in the details and in how closely the spirit of these principles is followed. There is still ambiguity in graphics APIs about how exactly some things should be handled, and NVIDIA has given us no hints about how it will handle questionable situations. We have no guidance on what constitutes proper trilinear filtering or the like, so we don't know exactly what we're getting with the "correct image."

More specifically and importantly, NVIDIA didn't address the issue of converting floating-point datatypes in pixel shader programs to 1) lower-precision floating-point or, more likely, 2) 12-bit integer. The issue of pixel shader precision is likely to be a bigger point of contention going forward than texture filtering methods, especially because NV3x chips appear to be rather pokey with FP16 and FP32 datatypes.
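To give a rough sense of what's at stake with precision, here's a minimal Python sketch that rounds the same value to FP32, FP16, and a 12-bit fixed-point format and reports the error each one introduces. This is our own illustration, not anything NVIDIA has described: the function names are made up, and the 12-bit layout (one sign bit, one integer bit, ten fraction bits, for a range of roughly -2 to 2) is an assumption about how such a format might be laid out.

```python
import struct


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fixed12(x: float, frac_bits: int = 10) -> float:
    """Quantize x to a signed 12-bit fixed-point value.

    Assumed layout (hypothetical): 1 sign bit, 1 integer bit and
    10 fraction bits, so out-of-range inputs are clamped to [-2, 2).
    """
    scale = 1 << frac_bits
    lowest, highest = -(1 << 11), (1 << 11) - 1
    quantized = max(lowest, min(highest, round(x * scale)))
    return quantized / scale


if __name__ == "__main__":
    value = 0.1  # e.g. a color or texture-coordinate term in a shader
    for name, convert in (("FP32", to_fp32),
                          ("FP16", to_fp16),
                          ("12-bit fixed", to_fixed12)):
        result = convert(value)
        print(f"{name:>12}: {result!r}  (error {abs(result - value):.2e})")
```

Under these assumptions, each step down in precision increases the rounding error, and the fixed-point format also clamps anything outside its small range. Whether errors of that size are visible depends on the shader, which is exactly why the "correct image" criterion needs a clearer definition.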

Of course, all of these concerns exist in the context of a stated willingness to do per-application optimizations. So we've seen a little progress today, but the jury's still out on whether this is enough to restore users' trust.
