The GeForce 8800 in SLI

WHEN WE REVIEWED the GeForce 8800, I said we’d test the GPU in an SLI configuration “as soon as we can.” I will admit that I’ve dabbled in CPUs a little too much, and our look at GeForce 8800 SLI has been delayed. However, I also wondered in that same review: “who needs two of these things right now?” That’s a pretty good question given the GeForce 8800 GTX’s astounding pixel-slinging performance, and something of a de-motivator for one considering looking into GeForce 8800 SLI.

But now I have seen the light. It’s wider than it is tall, modulated by a series of filters, and about 30″ from corner to corner. I’m talking, of course, about Dell’s 3007WFP LCD. We need not invent a reason for GeForce 8800 SLI since display makers have already invented a darn fine one. A four-megapixel monster like this one cries out for the fastest possible graphics subsystem to drive it, and the GeForce 8800 in SLI seems like a perfect match. We’ve had a bundle of fun testing the two together and exploring the world of uber-high-res widescreen gaming.

We’ve also dug a little deeper into GeForce 8800 antialiasing, to see how it compares to single- and multi-GPU antialiasing modes. Even the vaunted quad SLI makes an appearance to take on dual GeForce 8800 GTXs for extreme bragging rights supremacy. The power meter in Damage Labs has been spinning like a Hillary Clinton campaign staffer in the wake of the Obama announcement. Read on to see how we put all of that power to use.

G80 SLI: Powerful, yet unrefined
The GeForce 8800 is still a very new product, so running a pair of them in SLI is a funny mix of extreme, err, extremeness and as-yet-untapped potential. First, let’s talk about the extremeness. We already know that a single GeForce 8800 GTX graphics card performs more or less on par with two of its fastest competitor, the Radeon X1950 XTX, running in tandem. With 680 million transistors, the G80 graphics processor has a formidable appetite for power and cooling. In GTX form, the GeForce 8800 has 128 stream processors running at 1.35GHz and a 384-bit path to its 768MB of memory. The card is 10.5″ long and has two six-pin PCIe power connectors, like so:


The GeForce 8800 GTX’s twin power plugs

Running two of these cards together in a single system will require a grand total of four PCIe auxiliary power connectors, more than even most high-end power supplies can handle. We were able to get our test systems going in GTX SLI using our OCZ GameXStream 700W PSUs and a pair of four-pin Molex to PCIe adapters, but doing so ate up a connection on every Molex-equipped power lead on the PSU. We even had to share one lead with our DVD drive, a less-than-optimal solution—and this was on a test system with only one hard drive and no extra accessories. Those who are serious about building a system with dual GeForce 8800 GTX cards would do better to go with something like this one-kilowatt beast from BFG Tech.


Four PCIe connectors sprout from BFG’s 1kW power supply

The BFG Tech 1kW PSU comes with four six-pin PCIe leads out of the box, along with enough rated capacity to power five or six 100W light bulbs in addition to your PC. Unfortunately, like most 1kW PSUs, this BFG Tech one doesn’t have a larger, quieter 120mm cooling fan.

The GTX’s little brother, the GeForce 8800 GTS, doesn’t require such extreme measures, since it comes with only one PCIe plug per card. These cards are no longer than a Radeon X1950 XTX or a GeForce 7900 GTX, either. Still, with 96 stream processors clocked at 1.2GHz and 640MB of memory behind a 320-bit interface, the GTS isn’t exactly warm milk—more like Diet Mountain Dew: toned down, but still extreme.


A pair of BFG Tech GeForce 8800 GTS cards

Now, let’s talk about that untapped potential. Current GeForce 8800 drivers support dual-card SLI configurations, but no more than that. Yet every GeForce 8800 card comes with a pair of SLI connectors on board.


Dual connectors promise big things in the future

A dual-card GeForce 8800 SLI rig will only use one connector per card. In theory, the additional connectors could be used in offset fashion to create daisy-chained configurations of three, four, or more cards, once proper driver support is available. The fact that the G80 uses an external TMDS and RAMDAC chip to drive displays even suggests the possibility of cards a la the GeForce 7950 GX2 with dual GPUs and one display chip or even of “headless” GPU-only cards expressly intended for use with SLI. (Of course, daisy-chained cards would in all likelihood have to be equipped with single-slot coolers in order to fit into any standard-sized motherboard.) I expect the cards and drivers will materialize over time, but they’re probably not top priorities for the green team at present. They are, after all, winning the performance sweeps quite handily, and their G80 drivers remain a work in progress.

A more glaring omission is something I simply expected to see with GeForce 8800 SLI out of the box: SLI antialiasing. Both Nvidia’s SLI and ATI’s CrossFire can use multiple GPUs to achieve higher degrees of edge antialiasing than is possible with a single GPU. These antialiasing modes function as load-balancing methods that sacrifice raw performance for improved image quality. On the GeForce 7 series, SLI AA can deliver up to 16 samples with two GPUs and up to 32 samples in quad SLI. Similarly, CrossFire’s dual-GPU Super AA modes reach up to 12 samples. Surely, I thought, with G80’s nifty 16X coverage-sampled antialiasing, we’ll see SLI CSAA modes up to 32X in dual-GPU configurations. Turns out that’s not the case, at least not yet. Nvidia’s drivers haven’t yet enabled any SLI AA modes on the GeForce 8800. Support for SLI AA on the 8800 is in the works, but it’s not here yet, and we don’t have any ETA for it at present.

Fortunately, that’s not a major problem, given the G80 GPU’s excellent native AA support. We’ll compare SLI AA and SuperAA to GeForce 8800 antialiasing in the following pages, and you’ll see what I mean.

One display to rule them all

1600×1200 is a subset of the 3007WFP’s max resolution

Ok, so I resisted jumping on the mega-display bandwagon when ATI and Nvidia first started talking up “Extreme HD gaming” and the like. I first saw the Dell UltraSharp 3007WFP at CES last year, and I thought, “Hey, that’s gorgeous, but I’d rather have a used car for that kind of money,” and went on my merry way. I didn’t really consider the likes of this monitor or Apple’s big Cinema Displays as a live option for most folks, so grabbing one for video card testing seemed like a stretch. In other words, I’m really cheap.

I had other reasons to resist, too, including the fact that LCDs have traditionally been slower and had less color contrast than the best CRTs. Even more importantly, multiscan CRTs are capable of displaying multiple resolutions more or less optimally, while LCDs have a single, fixed native resolution.

However, LCD prices continue to plummet, and I recently found myself staring at a price of just under $1200 for a 3007WFP. That isn’t exactly cheap, but then we’re talking about reviewing a graphics subsystem that costs roughly the same amount. To mate it with a lesser display would be, well, unbalanced.

And the progress LCDs have made over CRTs has been tremendous lately. Not only have they made huge strides in color clarity and reached more-or-less acceptable response times, but they have also simply eclipsed CRTs in terms of resolution. In fact, every single common PC display resolution is a subset of the 3007WFP’s native 2560×1600 grid, so the 3007WFP can show lower resolutions in pixel-perfect 1:1 clarity as a box in the middle of the screen, if scaling is disabled.
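To make the "box in the middle of the screen" idea concrete, here is a minimal sketch of where an unscaled lower resolution would land on the panel's 2560×1600 grid. The function name and the centering rule are my own assumptions for illustration; the display or driver may position the image differently.

```python
NATIVE = (2560, 1600)  # native pixel grid of the 3007WFP


def centered_box(width, height, native=NATIVE):
    """Return (left, top, right, bottom) of an unscaled image centered on the panel."""
    native_w, native_h = native
    if width > native_w or height > native_h:
        raise ValueError("resolution does not fit inside the native grid")
    # Split the unused pixels evenly between the two sides.
    left = (native_w - width) // 2
    top = (native_h - height) // 2
    return left, top, left + width, top + height


print(centered_box(1600, 1200))  # (480, 200, 2080, 1400)
```

At 1600×1200, that leaves a 480-pixel black border on each side and 200 pixels above and below.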

That realization pretty much quelled any graphics-geek or display-purist objections I had to using an LCD for graphics testing, and I was at last dragged into the 21st century with the rest of the world—$1200 poorer, but with one really sweet monitor on the GPU test bench.

Make no mistake: the 3007WFP is worth every penny. It’s that good. Once you see it in person, sitting on your desk, it will force you to reevaluate your opinions on a host of issues, including the need for multiple monitors, the aspect ratio debate, the prospects for desktop computers versus laptops (’tis bright!), and whether your eight-megapixel digital photos are really sharp enough.

Moreover, the 3007WFP (or something like it) really is necessary to take advantage of two GeForce 8800 cards in SLI in the vast majority of current games. Prior to the 3007WFP, our max resolution for testing was 2048×1536 on a 21″ CRT. In our initial GeForce 8800 review, we found that the GeForce 8800 GTX could run the gorgeous and graphically intensive The Elder Scrolls IV: Oblivion at this resolution with the game’s Ultra High Quality presets, 4X AA with transparency supersampling, and high quality 16X anisotropic filtering at very acceptable frame rates. The GTX hit an average of 54 FPS and a low of 41 FPS, while the GTS averaged 37 FPS and bottomed out at 27 FPS. The GTX even achieved a very playable 45 FPS with 16X CSAA added to the mix. Who needs a second GPU when a single graphics card can crank out visuals like that at playable frame rates? Going for 8800 SLI doesn’t make sense in the context of a more conventional display, at least not with today’s game titles.

Width Height Mpixels
640 480 0.3
720 480 0.3
1024 768 0.8
1280 720 0.9
1280 960 1.2
1280 1024 1.3
1400 1050 1.5
1680 1050 1.8
1600 1200 1.9
1920 1080 2.1
1920 1200 2.3
1920 1440 2.8
2048 1536 3.1
2560 1600 4.1

The 3007WFP’s 2560×1600 resolution, though, raises the ante substantially. The table above shows a number of common PC and TV display resolutions along with their pixel counts. As you can see, four megapixels is a class unto itself, well above the three megapixels of our previous max, 2048×1536, or the more common two-megapixel modes like 1600×1200 or 1920×1080. The screen’s 16:10 (or 8:5, if you’re picky) aspect ratio also mirrors the shape of things to come, with the growing popularity of wide-aspect displays in everything from laptops to desktops to HDTVs. In fact, our recent poll suggested 45% of our readers already use a wide-aspect 16:9 or 16:10 primary display with their PCs. Given that fact, I had hoped to conduct testing for this article in a trio of 16:10 resolutions: 1680×1050, 1920×1200, and 2560×1600. That would have given us resolutions of 1.8, 2.3, and 4.1 megapixels, all with the same aspect ratio. I have my doubts about whether the shape of the viewport will impact performance in any noticeable way, but I wanted to give it a shot. However, I ran into problems with both games and video drivers supporting wide-aspect resolutions consistently, so I had to revert to another option, testing at 1600×1200, 2048×1536, and 2560×1600. That gives us a tidy look at performance scaling at two, three, and four megapixels.
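For anyone who wants to check the pixel counts and aspect ratios quoted here, the arithmetic is simple enough to script. This is just a quick sketch; the helper names are mine.

```python
from math import gcd


def megapixels(width, height):
    """Pixel count in millions."""
    return width * height / 1_000_000


def aspect_ratio(width, height):
    """Reduce width:height to lowest terms, e.g. 2560x1600 -> (8, 5)."""
    d = gcd(width, height)
    return width // d, height // d


# The three resolutions we ended up testing, plus the 16:10 trio I had hoped to use.
for w, h in [(1600, 1200), (2048, 1536), (2560, 1600),
             (1680, 1050), (1920, 1200)]:
    ratio_w, ratio_h = aspect_ratio(w, h)
    print(f"{w}x{h}: {megapixels(w, h):.1f} Mpixels, {ratio_w}:{ratio_h}")
```

The 16:10 modes all reduce to 8:5, while 1600×1200 and 2048×1536 are both 4:3.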

Insects crawl out from under the multi-GPU woodwork
While we’re on the subject of expensive things that are nice to have, we should take a moment to note some of the problems you’re buying into if you go this route. I can’t tell you how many times Nvidia and ATI have extolled the virtues of running “extreme HD gaming” involving high resolutions, wide aspect ratios, and multiple GPUs in the past couple of years. Too many to count. Yet when we took the plunge and went head-on into this world, we ran into unexpectedly frequent problems.

Of course, you’re probably already aware that multi-GPU support in games is spotty, because it typically requires a profile in the video card driver or special allowances in the game code itself. On top of that, not all games scale well with multiple GPUs, because performance scaling depends on whether the game is compatible with one of the various possible load-balancing methods. Nvidia and ATI have worked to encourage game developers to make their applications compatible with the best methods, such as alternate-frame rendering, but top new games still may not work well with SLI or CrossFire. This drawback is largely accepted as a given for multi-GPU configs today.
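As a rough illustration of what alternate-frame rendering means, the sketch below hands out frames to GPUs in round-robin order. It is a deliberately simplified model of the idea and not a description of how Nvidia's or ATI's drivers actually schedule work; the function name is hypothetical.

```python
def assign_frames_afr(frame_count, gpu_count):
    """Map each frame index to the GPU that would render it under a simple AFR scheme."""
    # Frame 0 goes to GPU 0, frame 1 to GPU 1, and so on, wrapping around.
    return [frame % gpu_count for frame in range(frame_count)]


print(assign_frames_afr(6, 2))  # [0, 1, 0, 1, 0, 1]
print(assign_frames_afr(6, 4))  # [0, 1, 2, 3, 0, 1]
```

A game that needs the finished previous frame before it can start the next one defeats this kind of scheme, which is one reason scaling varies so much from title to title.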

However, we ran into a number of other problems with wide-aspect multi-GPU gaming during our testing with SLI and CrossFire, including the following:

  • Nvidia’s control panel has settings to disable the scaling up of lower resolutions to fit the 3007WFP’s full res, but this option doesn’t “stick” on the latest GeForce 8800 drivers. Getting a native 1:1 display at lower resolutions on this monitor currently isn’t possible on the GeForce 8800 as a result. Nvidia confirmed for us that this is an unresolved bug. They are planning a fix, but it’s not available yet. This one isn’t an issue on the GeForce 7 or with the corresponding display settings in ATI’s Catalyst Control Center for the Radeon X1950.

  • Nvidia’s G80 drivers also offer no option for 1680×1050 display resolutions, either for the Windows desktop or for 3D games—another confirmed bug. I’m unsure whether this problem affects monitors with native 1680×1050 resolutions like the Dell UltraSharp 2007WFP, but it’s a jarring omission, nonetheless.

  • When picking various antialiasing modes via the Nvidia control panel on the GeForce 8800 with SLI, we found that Quake 4 crashed repeatedly. We had to reboot between mode changes in order to avoid crashes.

  • With Nvidia’s latest official GeForce 7-series drivers, 93.71, we found that quad SLI’s best feature, the SLI 8X AA mode, did not work. The system appeared to be doing 2X antialiasing instead. We had to revert to the ForceWare 91.47 drivers to test quad SLI 8X AA.

  • At higher resolutions with 4X AA and 16X aniso filtering enabled, Radeon X1950 CrossFire doesn’t work properly in Oblivion. The screen appears washed out, and antialiasing is not applied. ATI confirmed this is a bug in the Catalyst 7.1 drivers. We tried dropping back to Catalyst 6.12 and 6.11, with similar results, and ATI then confirmed that this bug has been around for a while. We had to go back to Cat 6.10 in order to test in Oblivion.

  • This isn’t entirely the fault of the graphics chip makers, but in-game support for wide-aspect resolutions is still spotty. For instance, the PC version of Rainbow Six: Vegas lacks built-in wide-aspect resolution options, despite the fact that it’s a port from the HDTV-enabled Xbox 360. That may be one symptom of a larger problem, that R6: Vegas is a half-assed PC port, but the game comes with an ATI logo on the box. Similarly, Battlefield 2142 has no widescreen support, and ships with an Nvidia “The way it’s meant to be played” logo on its box. Apparently, the extreme HD gaming hype hasn’t yet seeped into these firms’ developer relations programs.

  • It’s still not possible to drive multiple monitors with SLI or CrossFire enabled. As I understand it, this is a software limitation.

  • Finally, Nvidia’s first official Windows Vista driver release will not include SLI support.

I also think Nvidia has taken a pronounced step backward with its new control panel application for a number of reasons, including the fact that they’ve buried the option to turn on SLI’s load balancing indicators, but I will step down off of my soapbox now. The point remains that support for widescreen gaming at very high resolutions with multiple GPUs doesn’t appear to be a high priority for either Nvidia or ATI. Perhaps that’s understandable given the impending debut of Windows Vista and DirectX 10, which will require new drivers for all of their GPUs. Still, those who venture into this territory can expect to encounter problems. These issues are probably more plentiful with the GeForce 8800 because it’s still quite green. If you’re planning on laying out over $1100 on a pair of graphics cards, you might be expecting a rock-solid user experience. Based on what I’ve seen, you should expect some growing pains instead.

Now that I’ve wrecked any desire you had to build an SLI gaming rig, let’s repair that a little by looking at some performance numbers.

Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and the results were averaged.

Our test systems were configured like so:

Processor | Core 2 Extreme X6800 2.93GHz | Core 2 Extreme X6800 2.93GHz
System bus | 1066MHz (266MHz quad-pumped) | 1066MHz (266MHz quad-pumped)
Motherboard | Asus P5N32-SLI SE Deluxe | Asus P5W DH Deluxe
BIOS revision | 0305 | 0801
North bridge | nForce4 SLI X16 Intel Edition | 975X MCH
South bridge | nForce4 MCP | ICH7R
Chipset drivers | ForceWare 6.86 | INF Update 7.2.2.1007, Intel Matrix Storage Manager 5.5.0.1035
Memory size | 2GB (2 DIMMs) | 2GB (2 DIMMs)
Memory type | Corsair TWIN2X2048-8500C5 DDR2 SDRAM at 800MHz | Corsair TWIN2X2048-8500C5 DDR2 SDRAM at 800MHz
CAS latency (CL) | 4 | 4
RAS to CAS delay (tRCD) | 4 | 4
RAS precharge (tRP) | 4 | 4
Cycle time (tRAS) | 15 | 15
Hard drive | Maxtor DiamondMax 10 250GB SATA 150 | Maxtor DiamondMax 10 250GB SATA 150
Audio | Integrated nForce4/ALC850 with Realtek 5.10.0.6200 drivers | Integrated ICH7R/ALC882M with Realtek 5.10.00.5345 drivers
Graphics | GeForce 7900 GTX 512MB PCIe with ForceWare 93.71 drivers | Radeon X1950 XTX 512MB PCIe + Radeon X1950 CrossFire with Catalyst 7.1 drivers
 | Dual GeForce 7900 GTX 512MB PCIe with ForceWare 93.71 drivers | Radeon X1950 XTX 512MB PCIe with Catalyst 7.1 drivers
 | Dual GeForce 7950 GX2 1GB PCIe with ForceWare 93.71 drivers |
 | GeForce 8800 GTS 640MB PCIe with ForceWare 97.92 drivers |
 | Dual GeForce 8800 GTS 640MB PCIe with ForceWare 97.92 drivers |
 | GeForce 8800 GTX 768MB PCIe with ForceWare 97.92 drivers |
 | Dual GeForce 8800 GTX 768MB PCIe with ForceWare 97.92 drivers |
OS | Windows XP Professional (32-bit)
OS updates | Service Pack 2, DirectX 9.0c update (December 2006)

Thanks to Corsair for providing us with memory for our testing. Their quality, service, and support are easily superior to no-name DIMMs.

Our test systems were powered by OCZ GameXStream 700W power supply units. Thanks to OCZ for providing these units for our use in testing.

Unless otherwise specified, image quality settings for the graphics cards were left at the control panel defaults.

The test systems’ Windows desktops were set at 1280×960 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.

We used the following versions of our test applications:

The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

Quantifying the insanity
Shader arithmetic is becoming ever more important as games take better advantage of GPU programmability, but basic pixel fill rates and texturing capabilities remain an important component of overall performance. Delivered performance in these categories is also very much tied to memory bandwidth, so we’ll take a look at that, as well. Here are the theoretical peak numbers for single graphics cards; optimally, these numbers would double in SLI or CrossFire, with perfect scaling.

Graphics card | Core clock (MHz) | Pixels/clock | Peak pixel fill rate (Mpixels/s) | Textures/clock | Peak texel fill rate (Mtexels/s) | Effective memory clock (MHz) | Memory bus width (bits) | Peak memory bandwidth (GB/s)
GeForce 7900 GTX | 650 | 16 | 10400 | 24 | 15600 | 1600 | 256 | 51.2
Radeon X1950 XTX | 650 | 16 | 10400 | 16 | 10400 | 2000 | 256 | 64.0
GeForce 8800 GTS | 500 | 20 | 10000 | 24 | 12000 | 1600 | 320 | 64.0
GeForce 7950 GX2 | 2 * 500 | 32 | 16000 | 48 | 24000 | 1200 | 2 * 256 | 76.8
GeForce 8800 GTX | 575 | 24 | 13800 | 32 | 18400 | 1800 | 384 | 86.4

Note that the GeForce 7950 GX2 as listed already includes two G71 GPUs with their associated memory subsystems. You’ve got to then double those numbers for quad SLI configs. Nevertheless, the GeForce 8800 GTX has more memory bandwidth than the GX2, so dual GTXs in SLI will have more available memory bandwidth than a quad SLI rig. Yikes.
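If you want to see where these theoretical peaks come from, the short sketch below rebuilds them from the clock speeds and unit counts in the table. The dictionary layout and function name are mine, and I've entered the GX2 as a single 512-bit device (its two 256-bit buses combined) to match how the table treats it.

```python
CARDS = {
    # name: (core MHz, pixels/clock, textures/clock, effective memory MHz, bus width in bits)
    "GeForce 7900 GTX": (650, 16, 24, 1600, 256),
    "Radeon X1950 XTX": (650, 16, 16, 2000, 256),
    "GeForce 8800 GTS": (500, 20, 24, 1600, 320),
    "GeForce 7950 GX2": (500, 32, 48, 1200, 512),  # two GPUs, 2 * 256-bit
    "GeForce 8800 GTX": (575, 24, 32, 1800, 384),
}


def peak_rates(core_mhz, pixels_per_clock, textures_per_clock, mem_mhz, bus_bits):
    """Return (Mpixels/s, Mtexels/s, GB/s) theoretical peaks for one card."""
    pixel_fill = core_mhz * pixels_per_clock
    texel_fill = core_mhz * textures_per_clock
    # Bits per transfer -> bytes, then MB/s -> GB/s (decimal units, as in the table).
    bandwidth = mem_mhz * bus_bits / 8 / 1000
    return pixel_fill, texel_fill, bandwidth


for name, specs in CARDS.items():
    pixels, texels, gbps = peak_rates(*specs)
    print(f"{name}: {pixels} Mpixels/s, {texels} Mtexels/s, {gbps:.1f} GB/s")

# Perfect scaling: two GTXs versus two GX2 cards (quad SLI).
gtx_bw = peak_rates(*CARDS["GeForce 8800 GTX"])[2]
gx2_bw = peak_rates(*CARDS["GeForce 7950 GX2"])[2]
print(f"Dual GTX: {2 * gtx_bw:.1f} GB/s, quad SLI: {2 * gx2_bw:.1f} GB/s")
```

With ideal doubling, that works out to 172.8 GB/s for the dual GTX rig against 153.6 GB/s for quad SLI.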

The GeForce 8800 GTS, meanwhile, doesn’t compare favorably to the GeForce 7900 GTX in terms of pixel and texel fill rates, but you might suspect that won’t be an issue when it comes time to run the latest games. Let’s see how well these single and multi-GPU configs deliver on their theoretical promise in a synthetic benchmark.

Both GeForce 8800 SLI systems come close to their theoretical peaks for multitextured fill rate here, and those are very high indeed. They’re not the highest of the bunch, though. The quad SLI rig is the fastest, and the GeForce 8800 GTS SLI setup trails the GeForce 7900 GTX in SLI. Trouble is, we’re about to go prove that doesn’t really matter.

Quake 4
In order to make sure we pushed the video cards as hard as possible, we enabled Quake 4’s multiprocessor support before testing. We used the game’s “playnettimedemo” to play back our gaming session with the game engine’s physics and game logic active.

Now we come to the boring part of our commentary, when there’s little left to do but stop and wonder. Ooh. Ahh!

Nearly all of the multi-GPU systems tested deliver playable frame rates at 2560×1600, and the single 8800 GTX’s 52.5 FPS is plenty fast, as well. For what it’s worth, the 8800 GTX in SLI clearly outpaces the quad SLI system based on dual 7950 GX2s, though both are more than fast enough. Typically, quad SLI is held back by the three-buffer limit in DirectX 9, but that’s not a problem in the OpenGL-based Quake 4. Even so, the 8800 GTXs in SLI are faster.

F.E.A.R.
We’re using F.E.A.R.’s built-in “test settings” benchmark for a quick, repeatable comparison.

F.E.A.R. is a lot to handle at 2560×1600, and the GeForce 8800 GTX in SLI handles it best, followed by the quad SLI system. If you were running a GeForce 8800 GTS SLI setup, you’d probably want to drop back to 2X AA or to a lower resolution or lower quality setting in this game. CrossFire configs have long had trouble with F.E.A.R., and the problems continue here. The Radeon X1950 XTX CrossFire system turns in a decent average frame rate, but its minimum frame rate is quite low, and exactly the same, at 14 FPS, as a single Radeon X1950 XTX.

Half-Life 2: Episode One
The Source game engine uses an integer data format for its high dynamic range rendering, which allows all of the cards here to combine HDR rendering with 4X antialiasing.

Both GeForce 8800 cards can run Half-Life 2: Episode One just fine without the assistance of a second GPU. The GTS averages over 60 frames per second with HDR lighting, 4X AA, and 16X aniso. SLI does scale up from there, but I’m not sure Alyx at 100Hz is any better than Alyx at 60Hz.

The Elder Scrolls IV: Oblivion
We tested Oblivion by manually playing through a specific point in the game five times while recording frame rates using the FRAPS utility. Each gameplay sequence lasted 60 seconds. This method has the advantage of simulating real gameplay quite closely, but it comes at the expense of precise repeatability. We believe five sample sessions are sufficient to get reasonably consistent and trustworthy results. In addition to average frame rates, we’ve included the low frame rates, because those tend to reflect the user experience in performance-critical situations. In order to diminish the effect of outliers, we’ve reported the median of the five low frame rates we encountered. We set Oblivion’s graphical settings to “Ultra High Quality.” The screen resolution was set to 2560×1600, with HDR lighting enabled. 16X anisotropic filtering and 4X AA were forced on via the cards’ driver control panels. Since the G71 GPU can’t do 16-bit floating-point texture filtering and blending in combination with antialiasing, it had to sit out these tests.
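Here is a small sketch of how the five sessions get boiled down to the two numbers we report. The session figures below are made up for illustration, not measured results, and the function name is mine.

```python
from statistics import mean, median

# Hypothetical per-session results: (average FPS, lowest FPS). Not real data.
sessions = [(54.2, 41), (53.1, 39), (55.0, 42), (52.8, 28), (54.6, 41)]


def summarize(sessions):
    """Return (mean of the session averages, median of the session lows)."""
    avg_fps = mean(avg for avg, _ in sessions)
    # The median keeps one bad hitch (like the 28 above) from dragging the low figure down.
    low_fps = median(low for _, low in sessions)
    return avg_fps, low_fps


avg_fps, low_fps = summarize(sessions)
print(f"average: {avg_fps:.1f} FPS, median low: {low_fps} FPS")
```

Taking the minimum of the lows instead would report 28 FPS for this made-up set, even though four of the five runs never dipped below 39.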

Again, the GeForce 8800 GTX doesn’t look to need any additional help in order to run a recent game, even one as good-looking as Oblivion, at 2560×1600 resolution. In order to further stress these configs, I turned up the quality levels on everything. All quality sliders in the game were completely maxed out. In the cards’ driver control panels, I enabled 8X antialiasing. For the GeForce 8800s, that means 8X coverage sampled AA, and for the Radeon X1950 XTX CrossFire, it means 8X Super AA. I also turned on transparency supersampling on the GeForce 8800s and the corresponding high-quality adaptive AA mode on the Radeons, and I set texture filtering to its highest available quality setting. The cards then performed like so:

Push hard enough, and even a GeForce 8800 GTX will start to show signs of stress. With these settings, SLI makes the difference between definitely smooth, playable frame rates and borderline ones. Although, honestly, I think the 8800 GTX feels quite playable in Oblivion at these settings, and if it’s a problem, turning down transparency supersampling gets frame rates up to more than acceptable levels. To give you an idea of the sort of visuals we’re talking about, here’s a screenshot of an outdoor area in Oblivion loaded with vegetation. I used all of the settings above, except with 16X CSAA to make edges look a little smoother. Dual 8800 GTXs in SLI handle this scene at over 30 FPS.


Oblivion at 2560×1600 with HDR lighting, 16X aniso, 16X CSAA, transparency supersampling, and HQ filtering.

Rainbow Six: Vegas
This game is a new addition to our test suite, notable because it’s the first game we’ve tested based on Unreal Engine 3. This game doesn’t support 2560×1600 resolution out of the box, so we used the files available here to get it working at that res. Also, the game doesn’t have an SLI profile yet, but at Nvidia’s suggestion, I renamed the executable to “UTGame.exe” in order to get it working with SLI.

As with Oblivion, we tested with FRAPS. This time, I played through a 60-second portion of the “Border Town” map in the game’s Terrorist Hunt mode, with all of the game’s quality options cranked.

Note to self: when deciding test resolutions for lots of different configs, do not base the settings on what’s playable with the GeForce 8800 GTX in SLI. Man, playing this game on any of the other configs was painful, and it was practically impossible on the quad SLI system. Doh. Obviously, the GeForce 8800 GTX SLI system has everything else outclassed, so it’s able to run this game smoothly at 2560×1600 when nothing else can.

Ghost Recon Advanced Warfighter
We tested GRAW with FRAPS, as well. We cranked up all of the quality settings for this game, with the exception of antialiasing, since the game engine doesn’t take to AA very well.

Here, again, SLI raises frame rates just enough to assure fluid playability.

3DMark06

This will be a quick one, since I don’t have much to say about these results. They pretty much confirm what we’ve seen elsewhere.

The GeForce 8800 SLI systems seem to hit a limit in the simple vertex shader test at 307 FPS. Even though the individual 8800 GTS and GTX cards are faster than a single Radeon X1950 XTX, the CrossFire system comes out on top overall. In the complex vertex test, the 8800s in SLI reign supreme.

GeForce 8800 versus SLI AA and Super AA: the patterns
Antialiasing can get pretty complicated these days, but we can still sort through what the various GPUs are doing, by and large. The table below shows AA sampling patterns for the various modes used by each GPU or multi-GPU scheme. Samples for antialiasing routines are captured at a sub-pixel level, inside the area covered by a pixel. Basically, these patterns show from where inside of the pixel each of the samples is taken.

The multisampling AA routines in current GPUs rely on several different sample types, as well. The first of those is the sub-pixel color and depth (Z) information conferred by the sample point’s location in a polygon. The second is the polygon coverage information itself, and the third is the color information conferred by any textures or shader programs. Multisampling saves on bandwidth and fill rate by grabbing a smaller number of texture/shader samples per pixel than it does color/Z and coverage samples. The GPU then sorts out how to blend the texture and color information for the various sub-pixel fragments using the coverage information. For most of the GPUs involved here, the samples in the table below show only two sample types: texture/shader samples, which are green, and color/Z/coverage samples, which are pink.

The GeForce 8800’s coverage sampled AA is a further optimization of multisampling that discards a portion of the color/Z information that the GPU collects but preserves the information about which polygons cover the sample points. For instance, the 8X CSAA mode stores one texture/shader sample, four color/Z samples, and eight coverage samples. Doing so generally allows for a sufficient amount of color information, along with more accurate blending thanks to the additional coverage info. If you’re unfamiliar with CSAA, I suggest reading the section on it in my GeForce 8800 review. CSAA generally offers a very nice combination of performance and antialiasing quality for edge pixels.
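To show why extra coverage samples can help even when the number of stored colors stays small, here is a toy model of resolving one edge pixel. It assumes each coverage sample simply points at one of the stored colors and that the final pixel is a coverage-weighted average. That is my simplification for illustration, not a description of the G80's actual resolve hardware.

```python
from collections import Counter


def resolve_pixel(slot_colors, coverage_to_slot):
    """Blend stored colors, weighting each by the share of coverage samples pointing at it.

    slot_colors: list of (r, g, b) tuples, one per stored color sample.
    coverage_to_slot: for each coverage sample, the index of the stored color it maps to.
    """
    counts = Counter(coverage_to_slot)
    total = len(coverage_to_slot)
    return tuple(
        sum(slot_colors[slot][channel] * n for slot, n in counts.items()) / total
        for channel in range(3)
    )


white, black = (255, 255, 255), (0, 0, 0)

# A white polygon covering 3 of 8 coverage samples on a black background.
print(resolve_pixel([white, black], [0, 0, 0, 1, 1, 1, 1, 1]))  # (95.625, 95.625, 95.625)

# With only 4 coverage samples, the nearest available steps are 1/4 or 2/4 coverage.
print(resolve_pixel([white, black], [0, 1, 1, 1]))  # (63.75, 63.75, 63.75)
```

Eight coverage samples give the edge eight possible blend steps rather than four, without storing any more colors.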

The trick with the GeForce 8800 is figuring out where the coverage samples are located, since our usual FSAA Viewer tool doesn’t show them. Fortunately, we’ve managed to snag a tool that shows these coverage sample points, and I’ve overlaid them with our usual FSAA Viewer results in the table below for the GeForce 8800’s 8X, 8xQ, 16X, and 16xQ antialiasing modes. The smaller red dots in those patterns are the coverage sample locations.

I’ve also included results for the various multi-GPU antialiasing modes below. Many of them involve higher numbers of texture/shader samples, because they are the result of blending pixels from two different GPUs produced by their regular AA methods. Since the GeForce 8800 doesn’t yet have SLI AA support, it has to rely on its own native single-GPU AA modes.
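In its simplest form, the blending step in these multi-GPU modes can be pictured as averaging two finished frames, each antialiased with a different sample pattern. The sketch below assumes an equal-weight blend of two same-sized frames. The actual weighting used by SLI AA or Super AA isn't something I can confirm, so treat it as an illustration only.

```python
def blend_frames(frame_a, frame_b):
    """Average two same-sized frames pixel by pixel.

    Each frame is a list of rows, and each pixel is an (r, g, b) tuple.
    """
    return [
        [tuple((a + b) / 2 for a, b in zip(pixel_a, pixel_b))
         for pixel_a, pixel_b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


# One edge pixel that GPU A resolved to black and GPU B resolved to white.
gpu_a = [[(0, 0, 0)]]
gpu_b = [[(255, 255, 255)]]
print(blend_frames(gpu_a, gpu_b))  # [[(127.5, 127.5, 127.5)]]
```

Because both GPUs render the whole frame in this scheme, the second card buys smoother edges rather than more frames per second.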

[Table of AA sample patterns; images not reproduced. Columns: GeForce 7900 GTX, GeForce 7900 GTX SLI, GeForce 7950 GX2 SLI, GeForce 8800 GTX, Radeon X1950 XTX, Radeon X1950 XTX CrossFire. Rows: 2X, 4X, 6X, 8x, 8xS/8xQ/8X/10X, 12X, 14X, 16X, 16xQ, 32X.]

So we have lots of dots. What do they mean? First and foremost, I’d say we’ve learned that above 4X, the names given to the various modes—8X, 10X, and the like—don’t always mean the same things in terms of sample types and sizes. Comparing between them is difficult.

The new information here for us is the CSAA coverage sample patterns from the GeForce 8800, which are rather interesting. We can see immediately that the G80’s 8xQ mode is indeed simply a traditional 8X multisampling mode, where coverage information corresponds exactly to the location of color/Z samples.

The sample pattern for the CSAA 8X mode is notable, too. Four of the coverage samples correspond with the four color/Z samples, while the additional coverage samples are all located close to the pixel center, grouped around the texture/shader sample point. Nvidia seems to be betting that additional coverage information from the center of the pixel, near the texture/shader sample point, will produce the best results.

The 16X and 16xQ CSAA modes take a different approach, with no exact correspondence between color/Z sample points and coverage sample points. These two coverage sample patterns are both largely quasi-random, but they’re also different from one another. Nvidia claims these patterns were chosen intentionally to get the best results from the number of color/Z samples in each mode. In the 16X CSAA mode, four of the coverage samples fall just outside of the color/Z sample points. In the 16xQ mode, that’s not the case, but one of the coverage samples does appear to correspond with the texture/shader sample location.

Obviously, the older GPUs store more texture and color/Z samples than the GeForce 8800, especially in their multi-GPU AA modes. This fact should make them relatively slower—as should the fact that the multi-GPU AA modes aren’t particularly efficient methods of load balancing. The question is: how does the GeForce 8800 stack up in terms of image quality?


GeForce 8800 versus SLI AA and Super AA: image quality
Fortunately, the image quality question is fairly easy to answer by looking at some screenshots. The images below were taken from Half-Life 2 and blown up to exactly 4X their original size to make the individual pixels easier to see. I’ve chosen to focus on this little section of the screen because it shows us high-contrast edges at three different angles, all of which are relatively difficult cases for antialiasing.
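For the curious, blowing up a crop so that the individual pixels stay visible amounts to repeating each pixel in a square block. The sketch below assumes nearest-neighbor scaling, which preserves hard pixel edges. That choice and the function name are my assumptions for illustration, since any smoothing filter would blur the very edges we want to inspect.

```python
def upscale_nearest(pixels, factor=4):
    """Enlarge an image (a list of rows) so each source pixel becomes a factor x factor block."""
    out = []
    for row in pixels:
        # Repeat each pixel horizontally, then repeat the widened row vertically.
        wide_row = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide_row) for _ in range(factor))
    return out


tiny = [[0, 255],
        [255, 0]]
for row in upscale_nearest(tiny, factor=2):
    print(row)
```

At the default factor of 4, a 100×100 crop becomes a 400×400 image with no new colors introduced.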

In order to save space in the table below, I’ve taken some liberties with grouping the higher AA modes together. As we noted on the previous page, comparing between the modes exactly is difficult, so please indulge me. The groupings aren’t meant to suggest exact equivalency between the various AA modes.

Antialiasing quality
GeForce 7900 GTX SLI | GeForce 7950 GX2 SLI | Radeon X1950 XTX CrossFire | GeForce 8800 GTX
No AA
2X
4X
6X
8xS | 8xS | Super AA 8X | 8X
SLI 8X | SLI 8X | Super AA 10X | 8xQ
- | - | Super AA 12X | -
SLI 16X | SLI 16X | Super AA 14X | 16X
- | SLI 32X | - | 16xQ

The relevant comparisons here are in the 8X and 16X modes, roughly speaking. You may see things differently, but I happen to think the GeForce 8800’s CSAA 8X mode matches up well against the SLI AA 8X and Super AA 8X modes, perhaps even comparing favorably to them. The additional color/Z samples in the G80’s 8xQ mode don’t seem to add much, if anything.

At 16X, the G80’s CSAA still holds up well against the Radeon’s Super AA 14X and the GeForce 7 series’ SLI 16X modes, although the G71’s SLI 16X AA does look awfully good. Again, 16xQ doesn’t look much better than 16X, despite the additional color samples. I’d say quad SLI’s 32X mode produces the smoothest gradients of all, befitting its larger sample size.

Overall, then, the GeForce 8800’s coverage sampled AA manages to fend off two or even four previous-gen GPUs working in tandem, despite storing fewer color/Z samples. That’s very impressive, and as you might imagine, it leads to very good things when performance and image quality intersect.

GeForce 8800 versus SLI AA and Super AA: performance
Here’s how the various single- and multi-GPU setups scale across their respective antialiasing modes. I’ve broken the results out into three separate graphs due to the difficulty of directly comparing the GPUs’ various AA modes. All of the graphs have the same scale, though, and we can draw some conclusions based on these performance results and the image quality info from the last page.

There’s very little performance hit associated with the CSAA 8X and 16X modes of the GeForce 8800, and given their image quality, I’d say they merit direct comparison to the multi-GPU 8X and 16X modes on the older cards. That gets kind of ugly, though:

The long and the short of it is that the G80’s excellent and very efficient coverage sampled AA puts it on top, even without special SLI AA modes in Nvidia’s current drivers.

Power consumption
We measured total system power consumption at the wall socket using an Extech power analyzer model 380803. The monitor was plugged into a separate outlet, so its power draw was not part of our measurement. Out of necessity, we’re using a different motherboard for the CrossFire system, but for our power and noise tests, we tested the single Radeon X1950 XTX in the same motherboard as the rest of the single-card and SLI configs. Otherwise, the system components other than the video cards were kept the same.

The idle measurements were taken at the Windows desktop with SpeedStep power management enabled. The cards were tested under load running Oblivion using the game’s Ultra High Quality settings at 2560×1600 resolution with 16X anisotropic filtering. SpeedStep was disabled for the load tests.

The GeForce 8800 isn’t exactly easy on the juice, and SLI only exacerbates the situation. Still, the 8800 GTX SLI rig manages to draw less power under load than the Radeon X1950 XTX CrossFire system, remarkably enough.

Noise levels and cooling
We measured noise levels on our test systems, sitting on an open test bench, using an Extech model 407727 digital sound level meter. The meter was mounted on a tripod approximately 14″ from the test system at a height even with the top of the video card. The meter was aimed at the very center of the test systems’ motherboards, so that no airflow from the CPU or video card coolers passed directly over the meter’s microphone. We used the OSHA-standard weighting and speed for these measurements.

You can think of these noise level measurements much like our system power consumption tests, because the entire systems’ noise levels were measured, including CPU and chipset fans. We had temperature-based fan speed controls enabled on the motherboard, just as we would in a working system. We think that’s a fair method of measuring, since (to give one example) running a pair of cards in SLI may cause the motherboard’s coolers to work harder. The motherboard we used for all single-card and SLI configurations was the Asus P5N32-SLI SE Deluxe, which on our open test bench required an auxiliary chipset cooler. The Asus P5W DH Deluxe motherboard we used for CrossFire testing, however, didn’t require a chipset cooler. In all cases, we used a Zalman CNPS9500 LED to cool the CPU.

Of course, noise levels will vary greatly in the real world along with the acoustic properties of the PC enclosure used, whether the enclosure provides adequate cooling to avoid a card’s highest fan speeds, placement of the enclosure in the room, and a whole range of other variables. These results should give a reasonably good picture of comparative fan noise, though.

We measured the coolers at idle on the Windows desktop and under load while running Oblivion at 2560×1600 with 16X aniso.

The GeForce 8800’s cooler remains very impressive for its ability to cool a 680-million-transistor chip without making a racket. The picture doesn’t change too much in SLI, fortunately.

Conclusions
You really do need a four-megapixel display like the Dell 3007WFP in order to take full advantage of GeForce 8800 SLI with today’s games. Even then, a single GeForce 8800 GTX is often fast enough to drive a 2560×1600 monitor quite well without the aid of a second GPU; witness our test results in Quake 4, Half-Life 2: Episode One, and even Oblivion. Heck, the G80’s CSAA 8X and 16X modes carry so little performance penalty that one doesn’t really need SLI for them, either.

The big exception is Rainbow Six: Vegas, which is brutal at 2560×1600 on everything but a pair of 8800 GTXs. As the first Unreal Engine 3 game we’ve tested, it may be an indicator of things to come, but I’m not quite sure. It may also just be a lousy port from the Xbox 360. That said, more intensive games are always coming, and there will likely be a reason to upgrade to a second GeForce 8800 (even a second GTX) at some point in the next year or so. For now, though, you may want to keep a PCIe slot open and wait.

Still, I’ve played through decent chunks of both Rainbow Six: Vegas and Oblivion with a pair of 8800 GTXs in SLI on the Dell 3007WFP, and it’s a glorious thing, having smooth-as-glass frame rates with incredible image quality on a massive, detailed display. If you have the means to treat yourself to such a setup, the visceral experience certainly won’t disappoint.

I wish I could say the same for the driver support, but Nvidia doesn’t yet have all of the wrinkles ironed out of GeForce 8800 SLI in concert with wide-aspect displays. I do expect most of the problems I’ve noted to be fixed eventually, but with Vista imminent, I fear those fixes may be on the back burner for longer than usual. We’ll have to see. When they come, they may be accompanied by all kinds of other new goodies for 8800 SLI, including things like CSAA 32X antialiasing and three- or four-GPU SLI daisy chaining. At that point, we’ll have to hope new games are out to harness that power properly. An eight-megapixel display would be well and good, but at this point, I’d definitely rather have better pixels than more of them.
