We coped by doing most of our testing with 4X antialiasing and 16X anisotropic filtering enabled at resolutions as high as most monitors would take us: 1600x1200. That approach gave us a decent idea what kind of comparative performance to expect from the GeForce 7800 GTX, but it didn't entirely harness the power of two of them in SLI, as we learned when we drove a pair of 'em with the brand-new Athlon 64 FX-57 after the fact.
Some of our readers requested a different approach, asking us to test the cards at ultra-high resolutions, including the big kahuna, 2048x1536. We're obliging sorts of fellows, so we set out to locate and purchase a monitor capable of running at this almost otherworldly screen resolution. A hundred bucks and a brush with a hernia later, an enormous 22" aperture grille monitor sits atop the Damage Labs test bench, causing it to bow slightly downward. (OK, so I had to place the monitor over two support legs, so the end of the table didn't snap off.)
Turns out the performance numbers for the GeForce 7800 GTX look spectacular. So spectacular, in fact, that they prompted us to investigate a little further. What we found is that there are some very good reasons for the GeForce 7800 GTX's relatively strong performance at very high resolutions, reasons that go beyond the card's additional pixel shading power and memory bandwidth.
The resolution question
Seeing the 7800 GTX's unexpectedly large advantage at uber-high resolutions over other high-end cards (including the Radeon X850 XT Platinum Edition, the GeForce 6800 Ultra, and two 6800 Ultras in SLI) triggered a foggy memory deep in the dusty recesses of my head. I remembered hearing something about certain performance-enhancing features like HyperZ being disabled at higher resolutions on old ATI Radeons, so I fired off a couple of e-mails to ATI and NVIDIA asking whether such limitations could help explain the relatively weak high-res performance of the GeForce 6800 Ultra and Radeon X850 XT PE compared to the 7800 GTX.
First came an answer from NVIDIA's Nick Stam, who forthrightly but cryptically admitted that the NV40 GPU powering the GeForce 6800 Ultra isn't at its best at exceptionally high resolutions. His statement to me read like so:
The 6800 series chips were designed to provide excellent performance up to 1600x1200, and although they support higher 3D resolutions, they lack certain hardware features to deliver strong performance at 2048x1536. The 7800GTX was designed to deliver much higher raw performance at 2048x1536, and better performance scaling from 1600x1200 to 2048x1536.
So I was indeed on to something, an architecture difference between the NV40 and G70 GPUs. But what exactly was the difference?
The answer I got from ATI's David Nalasco was a little more enlightening:
Besides the obvious stuff like fill rate and memory, the only thing I can think of that might limit performance at really high resolutions is the size of the Hierarchical Z buffer. Back before the Radeon 9800, we had to disable HiZ if the display size exceeded the buffer size. That could cause a pretty big drop at high resolutions in some cases, especially on some of the derivative products (since the buffer size was tied to the number of pipes). But newer chips can just do HiZ on whatever fits in the buffer, so you still get most of the performance benefit.
Hierarchical Z is a fairly sophisticated performance optimization that attempts to skip over the drawing of polygons that won't show up onscreen in a final, rendered scene because they have other objects in front of them. Hierarchical Z attempts to rule out occluded objects by examining pixels in bunches rather than one by one, and it can be quite effective at cutting down on the amount of rendering work required to draw a scene.
We increased the size of the HiZ buffer with the X800 series to HD resolution, so I think it's something like 2 megapixels (assuming 1920x1080 or 1600x1200). At 2048x1536 you have a little over 3 megapixels, so even then around 2/3 of the pixels could benefit from HiZ. The performance impact of that will be app-dependant.
ATI's Radeon X800 chips have a Hierarchical Z buffer sized for optimal performance at about two megapixels. Beyond that, they fall back somewhat gracefully to performing Hierarchical Z culling on only a portion of the screen, and the GPU must render the rest of the pixels without the aid of this optimization. That's unfortunate, because at very high resolutions, the graphics chip is in most need of its pixel-pushing facilities. However, displays with resolutions beyond two megapixels aren't yet terribly common, so the impact of this limitation is probably not especially widespread.
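For the curious, here's a rough Python sketch of the idea: keep one "farthest depth" value per tile of pixels, throw out whole blocks that sit behind it, and fall back to ordinary per-pixel testing for any tile the buffer can't hold. Neither ATI nor NVIDIA has disclosed how their hardware actually lays this out, so the tile size, the row-major rule for deciding which tiles fit, and every name below (`HiZBuffer`, `can_skip`, `capacity_pixels`) are assumptions made purely for illustration.

```python
# Simplified hierarchical Z sketch. Larger depth means farther from the viewer.
# TILE, the capacity rule, and all names are illustrative assumptions,
# not a description of any real GPU.
TILE = 8  # pixels per tile edge (assumed)


class HiZBuffer:
    def __init__(self, width, height, capacity_pixels):
        self.tiles_x = -(-width // TILE)    # ceiling division
        self.tiles_y = -(-height // TILE)
        total_tiles = self.tiles_x * self.tiles_y
        # Track only as many tiles as the (hypothetical) on-chip buffer holds.
        self.tracked = min(total_tiles, capacity_pixels // (TILE * TILE))
        # Farthest depth currently stored in each tracked tile; 1.0 = far plane.
        self.max_depth = [1.0] * self.tracked

    def _index(self, tx, ty):
        i = ty * self.tiles_x + tx
        return i if i < self.tracked else None  # None: tile has no HiZ data

    def can_skip(self, tx, ty, block_min_depth):
        """True if an incoming block is entirely hidden within this tile."""
        i = self._index(tx, ty)
        if i is None:
            return False  # outside the buffer: must test pixel by pixel
        return block_min_depth > self.max_depth[i]

    def update(self, tx, ty, new_tile_max_depth):
        """Record a tile's farthest depth after closer geometry fills it."""
        i = self._index(tx, ty)
        if i is not None:
            self.max_depth[i] = min(self.max_depth[i], new_tile_max_depth)

    def coverage(self):
        """Fraction of the screen's tiles that benefit from HiZ."""
        return self.tracked / (self.tiles_x * self.tiles_y)


hiz = HiZBuffer(2048, 1536, capacity_pixels=2_000_000)
hiz.update(0, 0, 0.3)                  # a near wall fills tile (0, 0)
print(hiz.can_skip(0, 0, 0.5))         # block behind the wall
print(hiz.can_skip(255, 191, 0.5))     # tile beyond the buffer's reach
print(round(hiz.coverage(), 2))
```

With a two-megapixel budget at 2048x1536, the block behind the wall gets skipped, the block in the last tile doesn't, and coverage works out to a bit under two-thirds of the screen. At 1600x1200, the same budget covers every tile.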
Armed with more detail about the precise limitations of ATI's chips, I proceeded to pester NVIDIA for more information about the difference between the NV40 and the G70. They proved willing to reveal just as much information to me as ATI had, though no more:
There are many features and performance optimizations (like zcull) that on NV4x are optimized for 2 Mpixels, and on G70 are optimized for 3 Mpixels. Like ATI, we also can operate on part of the screen at resolutions beyond our design point (the level at which the chip was originally designed to perform optimally). For example, an NV40 running @ 2048x1536 might be around 60% efficient with some of its optimizations like zcull.
So the NV40 and its derivatives in the GeForce 6 series are indeed limited to two-megapixel resolutions for Z culling. There are also "features and performance optimizations" on the NV40, beyond Z culling, that run out of steam at very high resolutions, although NVIDIA isn't willing to specify exactly what those features might be. NVIDIA aimed for a higher resolution design point for the G70, and sized its buffers (and perhaps caches) accordingly. That helps explain why the GeForce 7800 GTX shines so brightly at 2048x1536 compared to the competition.
Obviously, these issues could become more relevant as high-definition displays become more prominent. For reference, here's how some of the more common display resolutions stack up in terms of megapixels. (I'm assuming "mega" in this case means an even million.)
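If you'd like to check the arithmetic yourself, a few lines of Python will do it for the resolutions discussed in this article. This is only a back-of-the-envelope sketch: the two-megapixel budget is taken literally as 2,000,000 pixels, which is my assumption, and the function names are made up for the example.

```python
# Resolutions named in this article; "mega" means an even million.
RESOLUTIONS = [(1600, 1200), (1920, 1080), (2048, 1536)]
DESIGN_POINT_PIXELS = 2_000_000  # assumed: "2 megapixels" taken literally


def megapixels(width, height):
    """Pixel count in millions."""
    return width * height / 1_000_000


def covered_fraction(width, height, budget=DESIGN_POINT_PIXELS):
    """Share of the screen that fits inside the pixel budget, capped at 1."""
    return min(1.0, budget / (width * height))


for w, h in RESOLUTIONS:
    print(f"{w}x{h}: {megapixels(w, h):.2f} MP, "
          f"{covered_fraction(w, h):.0%} inside a 2 MP budget")
```

That works out to about 1.92, 2.07, and 3.15 megapixels, respectively. Only about 64% of a 2048x1536 screen fits inside the budget, which squares with both ATI's "around 2/3" and NVIDIA's "around 60%" estimates.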
Like I said, this issue only affects very high resolutions. Even the quintessential HD resolution, 1920x1080, fits almost entirely inside of a two-megapixel limit.
Unless you own an incredibly nice display of some sort, the most important lesson to take away from these revelations has to do with the relevance of performance comparisons at ultra-high resolutions like 2048x1536. It might seem like a great idea to compare fill rate or pixel shader performance by pushing current games to insane resolutions. However, the benchmark numbers from those resolutions don't meaningfully indicate how these graphics cards might compare to one another when running a more shader- or fill rate-intensive application (like a future game) at more common resolutions.
With that said, knowing what we know now, it should be even more interesting to look at how the performance of these graphics cards scales at very high resolutions. Besides, I've already bought the monitor and run the numbers, so we might as well take a look, darnit.