The GeForce 9 series multi-GPU extravaganza

OK, I’ve been doing this since the last century, but I really have no idea how to frame this article. It might be a review of a new graphics card. Nvidia just recently introduced its GeForce 9800 GTX, and this is our first look at that card.

But we didn’t really stop there. We threw together two and then three 9800 GTX cards in order to see how they perform in some incredibly powerful and borderline ridiculous configurations. Then we crossed the line into crazy-land by doubling up on GeForce 9800 GX2 cards and testing the new generation of quad SLI, as well. What’s more, we tested against the previous generation of three-way SLI—based on the GeForce 8800 Ultra—and against the competing CrossFire X scheme involving three and four Radeons.

Most of you probably care incrementally less about these configurations as the GPU count—and price tag—rises. But boy, do we ever have a hefty amount of info on the latest GPUs compiled in one place, and it’s a pretty good snapshot of the current state of things. Keep reading if you’re into that stuff.

GT to the X

A new video card doesn’t come along every day. Seriously. About three Sundays ago, not a single product announcement hit my inbox. Most days, however, it seems that at least one new variant of an already known quantity hits the streets with some kind of tweak in the clock speeds, cooling solutions, product bundles, or what have you.

Such is the case—despite the GeForce 9-series name—with the GeForce 9800 GTX. This card is, ostensibly, the replacement for the older GeForce 8800 GTX, but it’s very, very similar to the GeForce 8800 GTS 512 released in December—same G92 GPU, same 128 stream processors, same 256-bit memory path, same PCI Express 2.0 interface, same 512MB of GDDR3 memory. Even the cooler has a similarly angled fan enclosure, as you can see by glancing at our trio of 9800 GTX cards pictured below.

GeForce 9800 GTX cards from Palit, BFG Tech, and XFX

Not that there’s anything wrong with that. The G92 is a very capable GPU, and we liked the 8800 GTS 512. Just don’t expect earth-shaking miracles in the move from the GeForce 8 series to the 9 series. In terms of specs, the most notable differences are some tweaked clock speeds. By default, the 9800 GTX ships with a 675MHz core, 1688MHz shader processors, and 1100MHz memory. That’s up slightly from the defaults of 650MHz, 1625MHz, and 970MHz on the 8800 GTS 512.

The GeForce 8800 GTS 512 (left) versus the 9800 GTX (right)

As is immediately obvious, however, the 9800 GTX departs from its elder sibling in some respects. Physically, the GTS 512 is only 9″ long and has but one six-pin aux power plug and one SLI connector onboard. The 9800 GTX is larger at 10.5″ long and has dual six-pin power connectors and two SLI “golden fingers” interfaces along its top.

These dual SLI connectors create the possibility of running a triumvirate of 9800 GTX cards together in a three-way SLI config for bone-jarring performance, and we’re set to explore that possibility shortly.

Another feature new to the 9800 GTX is support for Nvidia’s HybridPower scheme. The idea here is that when the GTX is mated with a compatible Nvidia chipset with integrated graphics, the discrete graphics card can be powered down for everyday desktop work, saving on power and noise. Fire up a game, though, and the GTX will come to life, taking over the 3D graphics rendering duties. We like the concept, but we haven’t yet seen it in action, since Nvidia has yet to release a HybridPower-capable chipset.

The three 9800 GTX cards we have gathered here today present strikingly similar propositions. Prices for the BFG Tech card start at $329.99 at online vendors. In typical BFG style, there’s no bundled game, but you do get a lifetime warranty. XFX ups the ante somewhat by throwing in a copy of Company of Heroes and pledging to support its card for a lifetime, plus through one resale, for the same starting price of 330 bucks. The caveat with both companies is that you must register your card within 30 days after purchase, or the warranty defaults to a one-year term. Palit, meanwhile, offers a two-year warranty and throws in a copy of Tomb Raider Anniversary, for the same price. All three cards share the same board design, which I understand is because Nvidia exercises strict control over its higher-end products. That’s a shame, because we really like the enhancements Palit built into its GeForce 9600 GT and wouldn’t mind seeing them in this class of card, as well.

Another attribute all three cards share is Nvidia’s stock clock speeds for the 9800 GTX. That will impact our performance comparison, because the EVGA GeForce 8800 GTS 512 card we tested came out of the gate with juiced up core and shader clocks of 670MHz and 1674MHz, respectively, which put it very close to the 9800 GTX. That’s not to say 9800 GTX cards are all wet noodles. Already, BFG Tech has announced GTX variants ranging up to 755MHz core, 1890MHz shader, and 1150MHz memory frequencies. I’d expect similar offerings from some of the other guys soon, too.

Stacked

Setting up a three-way SLI rig with GeForce 9800 GTX cards isn’t all that different from doing so with GeForce 8800 Ultras. For the 9800 GTX, we chose to upgrade from an nForce 680i SLI-based motherboard to a 780i SLI mobo in order to gain support for PCI Express 2.0. As we’ve explained, the nForce 780i SLI’s PCIe 2.0 is a little odd since it uses a PCI Express bridge chip, but Nvidia claims it should be adequate. The ideal configuration would probably be a board based on the nForce 790i SLI, but I didn’t have one of those handy.



You will need some specialized hardware in order to make a setup like this go. In addition to the motherboard and graphics cards, you’ll need a three-way SLI connector like the one pictured above, which you may have to order separately. This connector snaps into place atop all three cards, connecting them together. You’ll also need a power supply with six auxiliary PCIe power connectors and sufficient output to power the whole enchilada. I used a PC Power & Cooling Turbo-Cool 1200 that’s more than up to the task.

We used the same basic building blocks for our quad SLI test rig, but we swapped out the 9800 GTX cards for a pair of GeForce 9800 GX2s from Palit and XFX. We tested the XFX card in our initial GeForce 9800 GX2 review. As you may have learned from that article, each of these cards has two G92 GPUs on it. They’re also not cheap. The Palit card is currently going for $599 on Newegg, although there’s a $30 mail-in rebate attached, if you’re willing to jump through that particular flaming hoop.

Since the GX2 really packs ’em in, a quad SLI setup actually requires fewer power leads and occupies less slot space than a three-way config. Quad SLI also avoids that middle PCIe slot on all capable (nForce 680i, 780i, 790i) motherboards. That slot could be a bottleneck because its 16 lanes of first-gen PCIe connectivity hang off the chipset’s south bridge.

Of scaling—and failing

Multi-GPU schemes like SLI are always prone to fragility, and sometimes their performance simply doesn’t scale up well from one GPU to two, or from two to three or four. We’ve written about this many times before, most extensively in this section of our CrossFire X review, so I won’t cover that same ground once again. You’ll see this dynamic in effect in our performance results shortly.

We noticed some particular quirks of three- and four-way SLI in preparing this article, though, that bear mentioning here. One of those issues involves video memory. Very high performance graphics subsystems require lots of memory in order to work effectively at the extreme resolutions and quality levels they can achieve. That presents a problem for G92-based SLI because current GeForce 9800 GTX and GX2 cards come with 512MB of memory per GPU, and SLI itself eats up some video memory. In some cases, we found that the GeForce 8800 Ultra, with 768MB of RAM, performed better due to apparent video memory size limitations with G92-based SLI.

In addition, we used the 32-bit version of Windows Vista on our GPU test platforms, largely because Nvidia’s 64-bit drivers have sometimes lagged behind their 32-bit counterparts on features, optimizations, or QA validation. Since we’re often testing pre-release hardware with early drivers, that was a real problem. However, choosing a 32-bit OS in this day and age has its own perils. As you may know, 32-bit versions of Windows have a total memory address space of 4GB. We installed 4GB worth of DIMMs in our test systems, but some of the OS’s total address space is reserved by hardware devices, including video cards. For example, we see a total of 3.58GB of physical memory available in Task Manager when we have a single GPU installed on our Gigabyte X38-DQ6-based system.

This limitation hasn’t been much of a problem for us in the past, but it grew more acute on our nForce 780i SLI-based test system. With a single GPU installed, that system showed only 2.8GB of physical RAM available. With dual GPUs, the total dropped to 2.5GB, and then to 2.3GB with three GPUs. Our quad SLI system saw that number dip to 1.79GB, which is, well, uncomfortable. I doubt it had much impact on our performance testing overall, but it’s not a lot of headroom and one heck of a waste of RAM. The lesson: use a 64-bit OS with these exotic three- and four-way SLI setups. We’ll be moving our test platforms to Vista x64 soon.
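To make the arithmetic concrete, here’s a minimal sketch of the 32-bit squeeze. The reservation figure passed in below is a hypothetical aggregate for the chipset and GPU apertures, chosen only to illustrate the math, not something we measured directly:

```python
# A 32-bit OS has a 4GB physical address space that installed RAM must
# share with memory-mapped device apertures (frame buffers, registers,
# and so on). Every GB the hardware claims comes straight out of the RAM
# the OS can see.
ADDRESS_SPACE_GB = 4.0

def visible_ram_gb(installed_gb, reserved_gb):
    """RAM left visible to the OS after device reservations.

    reserved_gb is a hypothetical total for chipset plus GPU apertures.
    """
    return min(installed_gb, ADDRESS_SPACE_GB - reserved_gb)

# With 4GB of DIMMs installed, a ~2.21GB aggregate reservation would
# leave roughly the 1.79GB we observed on the quad SLI rig.
print(round(visible_ram_gb(4.0, 2.21), 2))
```

Add GPUs and the reservations grow, which is why the visible total kept shrinking as we stacked cards.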

Behold the SLI stack

We ran into a couple of other little problems, as well. For one, screenshots captured on the Windows desktop with Aero enabled had big, black horizontal bands running across them when SLI was enabled with Nvidia’s latest 174.53 drivers. This is just a bug and not typical of the SLI setups we’ve seen in the past. Another likely bug was related to the “Do not scale” setting in Nvidia’s drivers that disables GPU scaling of lower resolutions up to the monitor’s native res. When we had that option enabled with three- and four-way SLI, 3D applications would start up with a black screen and the system would become unresponsive. We’d have to reboot the system to recover. Nvidia’s graphics drivers usually avoid such quirkiness, but right now, those two issues are very real.

Test notes

You can see all of our test configs below, but I’d like to make note of a few things. First, the GeForce 9600 GT card that we tested was “overclocked in the box” a little more aggressively than most (its core runs at 700MHz, while most cards are 650-675MHz), so its performance is a bit higher than typical. Meanwhile, we tested the GeForce 8800 GT and Radeon HD 3870 at their stock speeds, which are increasingly rare in this segment. Most shipping products have higher clocks these days.

Beyond that, we’re in pretty good shape. Our examples of the Radeon HD 3850 512MB and GeForce 8800 GTS 512MB are both clocked above the baseline frequencies by typical amounts, and most of the higher end cards tend to run close to their baseline clock speeds.

Our testing methods

As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and the results were averaged.

Our test systems were configured like so:

Three platforms served as hosts. Where the systems differ, values are given as Gigabyte X38-DQ6 | XFX nForce 680i SLI | EVGA nForce 780i SLI.

Processor: Core 2 Extreme X6800 2.93GHz (all three systems)
System bus: 1066MHz (266MHz quad-pumped)
Motherboard: Gigabyte GA-X38-DQ6 | XFX nForce 680i SLI | EVGA nForce 780i SLI
BIOS revision: F7 | P31 | P03
North bridge: X38 MCH | nForce 680i SLI SPP | nForce 780i SLI SPP
South bridge: ICH9R | nForce 680i SLI MCP | nForce 780i SLI MCP
Chipset drivers: INF update 8.3.1.1009 and Matrix Storage Manager 7.8 | ForceWare 15.08 | ForceWare 9.64
Memory size: 4GB (4 DIMMs) in each system
Memory type: 2 x Corsair TWIN2X20488500C5D DDR2 SDRAM at 800MHz
CAS latency (CL): 4
RAS to CAS delay (tRCD): 4
RAS precharge (tRP): 4
Cycle time (tRAS): 18
Command rate: 2T
Audio: integrated ICH9R/ALC889A | nForce 680i SLI/ALC850 | nForce 780i SLI/ALC885, all with RealTek 6.0.1.5497 drivers
Hard drive: WD Caviar SE16 320GB SATA
OS: Windows Vista Ultimate x86 Edition
OS updates: KB936710, KB938194, KB938979, KB940105, KB945149, DirectX November 2007 update

Graphics configurations tested across these systems:

Diamond Radeon HD 3850 512MB PCIe with Catalyst 8.2 drivers
Dual Radeon HD 3850 512MB PCIe with Catalyst 8.2 drivers
Radeon HD 3870 512MB PCIe with Catalyst 8.2 drivers
Dual Radeon HD 3870 512MB PCIe with Catalyst 8.2 drivers
Radeon HD 3870 X2 1GB PCIe with Catalyst 8.2 drivers
Dual Radeon HD 3870 X2 1GB PCIe with Catalyst 8.3 drivers
Radeon HD 3870 X2 1GB PCIe + Radeon HD 3870 512MB PCIe with Catalyst 8.3 drivers
Palit GeForce 9600 GT 512MB PCIe with ForceWare 174.12 drivers
Dual Palit GeForce 9600 GT 512MB PCIe with ForceWare 174.12 drivers
GeForce 8800 GT 512MB PCIe with ForceWare 169.28 drivers
Dual GeForce 8800 GT 512MB PCIe with ForceWare 169.28 drivers
EVGA GeForce 8800 GTS 512MB PCIe with ForceWare 169.28 drivers
GeForce 8800 Ultra 768MB PCIe with ForceWare 169.28 drivers
Dual GeForce 8800 Ultra 768MB PCIe with ForceWare 169.28 drivers
Triple GeForce 8800 Ultra 768MB PCIe with ForceWare 169.28 drivers
GeForce 9800 GTX 512MB PCIe with ForceWare 174.53 drivers
Dual GeForce 9800 GTX 512MB PCIe with ForceWare 174.53 drivers
Triple GeForce 9800 GTX 512MB PCIe with ForceWare 174.53 drivers
GeForce 9800 GX2 1GB PCIe with ForceWare 174.53 drivers
Dual GeForce 9800 GX2 1GB PCIe with ForceWare 174.53 drivers

Please note that we tested the single and dual-GPU Radeon configs with the Catalyst 8.2 drivers, simply because we didn’t have enough time to re-test everything with Cat 8.3. The one exception is Crysis, where we tested single- and dual-GPU Radeons with AMD’s 8.451-2-080123a drivers, which include many of the same application-specific tweaks that the final Catalyst 8.3 drivers do.

Thanks to Corsair for providing us with memory for our testing. Their quality, service, and support are easily superior to those of no-name DIMMs.

Most of our test systems were powered by PC Power & Cooling Silencer 750W power supply units. The Silencer 750W was a runaway Editor’s Choice winner in our epic 11-way power supply roundup, so it seemed like a fitting choice for our test rigs. The three- and four-way SLI systems required a larger PSU, so we used a PC Power & Cooling Turbo-Cool 1200 for those systems. Thanks to OCZ for providing these units for our use in testing.

Unless otherwise specified, image quality settings for the graphics cards were left at the control panel defaults. Vertical refresh sync (vsync) was disabled for all tests.

We used the following versions of our test applications:

The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

Doing the math

Nvidia’s G92 graphics processor is extremely potent, so the GeForce 9800 GTX is poised to become one of the fastest single-GPU graphics cards on the planet. When you start stacking those puppies up into multiples of two, three, and four, the theoretical peak throughput numbers start getting absolutely sick. Have a look.

                                  Peak pixel   Peak bilinear   Peak bilinear    Peak memory   Peak shader
                                  fill rate    texel filtering FP16 texel       bandwidth     arithmetic
                                  (Gpixels/s)  rate (Gtexels/s) filtering rate  (GB/s)        (GFLOPS)
                                                               (Gtexels/s)
GeForce 8800 GT                   9.6          33.6            16.8             57.6          336
GeForce 8800 GTS                  10.0         12.0            12.0             64.0          230
GeForce 8800 GTS 512              10.4         41.6            20.8             62.1          416
GeForce 8800 GTX                  13.8         18.4            18.4             86.4          346
GeForce 8800 Ultra                14.7         19.6            19.6             103.7         384
GeForce 8800 Ultra SLI (x2)       29.4         39.2            39.2             207.4         768
GeForce 8800 Ultra SLI (x3)       44.1         58.8            58.8             311.0         1152
GeForce 9800 GTX                  10.8         43.2            21.6             70.4          432
GeForce 9800 GTX SLI (x2)         21.6         86.4            43.2             140.8         864
GeForce 9800 GTX SLI (x3)         32.4         129.6           64.8             211.2         1296
GeForce 9800 GX2                  19.2         76.8            38.4             128.0         768
GeForce 9800 GX2 SLI (x4)         38.4         153.6           76.8             256.0         1536
Radeon HD 2900 XT                 11.9         11.9            11.9             105.6         475
Radeon HD 3850                    10.7         10.7            10.7             53.1          429
Radeon HD 3870                    12.4         12.4            12.4             72.0          496
Radeon HD 3870 X2                 26.4         26.4            26.4             115.2         1056
Radeon HD 3870 X2 + 3870 (x3)     37.2         37.2            37.2             172.8         1488
Radeon HD 3870 X2 CrossFire (x4)  52.8         52.8            52.8             230.4         2112

You see, there are a lot of numbers there, and they become increasingly large as you add GPUs. Impressive.

One thing I should note: I’ve changed the FLOPS numbers for the GeForce cards compared to what I used in past reviews. I decided to use a more conservative method of counting FLOPS per clock, and doing so reduces theoretical GeForce FLOPS numbers by a third. I think that’s a more accurate way of counting for the typical case.
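For the curious, the GeForce figures above can be reproduced from each chip’s unit counts and clocks. Here’s a quick sketch using the 9800 GTX’s stock specs; it assumes G92’s 16 ROPs and 64 bilinear filtering units, with FP16 textures filtering at half the bilinear rate:

```python
# Reproducing the 9800 GTX row of the table from unit counts and clocks.
def peak_specs(rops, filter_units, sps, core_mhz, shader_mhz, mem_mhz, bus_bits):
    return {
        "pixel_fill_gpix": rops * core_mhz / 1000,
        "bilinear_gtex": filter_units * core_mhz / 1000,
        # FP16 texture formats filter at half the bilinear rate on G92.
        "fp16_gtex": filter_units * core_mhz / 2 / 1000,
        # GDDR3 transfers twice per clock across the 256-bit bus.
        "bandwidth_gbs": 2 * mem_mhz * bus_bits / 8 / 1000,
        # The conservative FLOPS count: one two-flop MAD per SP per
        # clock, ignoring the co-issued MUL (which would add half again).
        "gflops": 2 * sps * shader_mhz / 1000,
    }

gtx = peak_specs(rops=16, filter_units=64, sps=128,
                 core_mhz=675, shader_mhz=1688, mem_mhz=1100, bus_bits=256)
print(gtx)  # pixel fill 10.8, bilinear 43.2, FP16 21.6, bandwidth 70.4, ~432 GFLOPS
```

Multiply everything by the GPU count and you get the SLI rows, which is exactly why these theoretical peaks say nothing about scaling in practice.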

Even so, these numbers assume that each GPU can reach its theoretical peak and that we’ll see perfect multi-GPU scaling. Neither of those things ever really happens, even in synthetic benchmarks. They sometimes get close, though. Here’s how things measure up in 3DMark’s synthetic feature tests.

Performance in the single-textured fill rate test tends to track more closely with memory bandwidth than with peak theoretical pixel fill rates, which are well beyond what the graphics cards achieve. The GeForce 9-series multi-GPU configs are absolute beasts in multitextured fill rate, both in theory and in this synthetic test.

In both tests, the GeForce 9800 GTX almost exactly matches the GeForce 8800 GTS 512. Those aren’t typos—just very similar results from very similar products.

Obviously, the SLI systems have scaling trouble in the simple vertex shader test. Beyond that, the results tend to fit pretty well with the expectations established by our revised FLOPS numbers. I wouldn’t put too much stock into them, though, as a predictor of game performance. We can measure that directly.

Call of Duty 4: Modern Warfare

We tested Call of Duty 4 by recording a custom demo of a multiplayer gaming session and playing it back using the game’s timedemo capability. Since these are high-end graphics configs we’re testing, we enabled 4X antialiasing and 16X anisotropic filtering and turned up the game’s texture and image quality settings to their limits.

We’ve chosen to test at 1680×1050, 1920×1200, and 2560×1600—resolutions of roughly two, three, and four megapixels—to see how performance scales. I’ve also tested at 1280×1024 with the lower-end graphics cards, since some of them struggled to deliver completely fluid frame rates at 1680×1050.
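The megapixel figures behind those resolutions are easy to verify:

```python
# The arithmetic behind the "roughly two, three, and four megapixels"
# description of our test resolutions.
def megapixels(width, height):
    return width * height / 1_000_000

for w, h in [(1280, 1024), (1680, 1050), (1920, 1200), (2560, 1600)]:
    print(f"{w}x{h}: {megapixels(w, h):.2f} MP")
```

At 2560×1600, a GPU is pushing more than twice the pixels it does at 1680×1050, which is why the multi-GPU configs only start to separate from the pack at the top resolution.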

Apologies for the mess that is my GPU scaling line graph. All I can say is that I tried. You can see some of the trends pretty clearly, even with the mess.

The GeForce 9800 GTX tends to perform just a little bit better than the 8800 GTS 512, as one would expect given its marginally higher clock speeds. The result isn’t a big improvement, but it is sufficient to put the 9800 GTX in league with the GeForce 8800 Ultra, the fastest single-GPU card here.

AMD’s fastest card, the Radeon HD 3870 X2, is a dual-GPU job, and it’s quicker than the 9800 GTX. Costs more, too, so this is no upset.

Look to the results at 2560×1600 resolution to see where the multi-GPU configs really start to distinguish themselves. Here, two GeForce 9800 GTX cards prove to be faster than three Radeon HD 3870 GPUs and nearly as fast as four. However, the quad SLI rig gets upstaged by the three-way GeForce 8800 Ultra rig, whose superior memory bandwidth and memory size combine to give it the overall lead.

The frame rates we’re seeing here also give us a sense of proportion. Realistically, with average frame rates in the fifties, a single GeForce 9800 GTX will run CoD4 quite well at 1920×1200 with most image quality enhancements, like antialiasing and aniso filtering, enabled. You really only need multi-GPU action if you’re running at four-megapixel resolutions like 2560×1600, and even then, two GeForces should be plenty sufficient. The exception may be the GeForce 8800 GT and 9600 GT cards in SLI, whose performance tanks at 2560×1600. I believe they’re running out of video memory here, and newer drivers may fix that problem.

Enemy Territory: Quake Wars

We tested this game with 4X antialiasing and 16X anisotropic filtering enabled, along with “high” settings for all of the game’s quality options except “Shader level” which was set to “Ultra.” We left the diffuse, bump, and specular texture quality settings at their default levels, though. Shadows, soft particles, and smooth foliage were enabled. Again, we used a custom timedemo recorded for use in this review.

I’ve excluded the three- and four-way CrossFire X configs here since they don’t support OpenGL-based games like this one.

The GeForce 9800 GTX again performs as expected, just nudging ahead of the 8800 GTS 512. This time, it can’t keep pace with the GeForce 8800 Ultra, though.

Among the Nvidia multi-GPU systems, the three-way GeForce 8800 Ultra setup again stages an upset, edging out both the three- and four-way G92 SLI rigs at 2560×1600. Also, once more, we’re seeing frame rates of over 70 FPS with two 9800 GTX cards, raising the question of whether three or more G92 GPUs offer tangible benefits with today’s games.

Half-Life 2: Episode Two

We used a custom-recorded timedemo for this game, as well. We tested Episode Two with the in-game image quality options cranked, with 4X AA and 16X anisotropic filtering. HDR lighting and motion blur were both enabled.

Our single-GPU results for the 9800 GTX continue the cavalcade of precisely fulfilled expectations established by the 8800 GTS 512. Unfortunately, that doesn’t make for very good television.

The pricey multi-GPU solutions offer some flash and flair, though, with the G92 SLI configs finally showing signs of life. They scale better than the GeForce 8800 Ultra three-way setup, vaulting them into the 90 FPS range. Back on planet Earth, though, most folks would probably be perfectly happy with the performance of two GeForce 9600 GTs here, even at our top resolution.

Crysis

I was a little dubious about the GPU benchmark Crytek supplies with Crysis after our experiences with it when testing three-way SLI. The scripted benchmark does a flyover that covers a lot of ground quickly and appears to stream in lots of data in a short period, possibly making it I/O bound—so I decided to see what I could learn by testing with FRAPS instead. I chose to test in the “Recovery” level, early in the game, using our standard FRAPS testing procedure (five sessions of 60 seconds each). The area where I tested included some forest, a village, a roadside, and some water—a good mix of the game’s usual environments.

Because FRAPS testing is a time-intensive endeavor, I’ve tested the lower-end graphics cards at 1680×1050 and the higher-end cards at 1920×1200, with the G92 SLI and CrossFire X configs included in both groups.

This is one game where additional GPU power is definitely welcome, and the dual 9800 GTX and GX2 configs seem to be off to a good start at 1680×1050. Median lows in the mid-20s in our chosen test area, which seems to be an especially tough case, tend to add up to a pretty playable experience overall.

However, the performance of the three- and four-way G92 SLI configs begins to go wobbly at 1920×1200, where we’d expect them to get relatively stronger. Heck, the three-way 9800 GTX setup has trouble at 1680×1050, even—perhaps a sign that that third PCIe slot’s bandwidth is becoming a problem. Now look what happens when we turn up Crysis‘ quality options to “very high” and enable 4X antialiasing.

Ouch! All of the G92-based configs utterly flounder. The quad SLI rig simply refused to run the game with these settings, and the others were impossibly slow. Why? I believe what’s happening here is the G92-based cards are running out of video RAM. The GeForce 8800 Ultra, with its 768MB frame buffer, fares much better. So do the Radeons, quite likely because AMD’s doing a better job of memory management.

To be fair, I decided to test the G92-based configs at “very high” with antialiasing disabled, to see how SLI scaling would look without the video memory crunch. Here’s what I found.

Even here, three- and four-way SLI aren’t appreciably faster than two-way, and heck, quad SLI is still slower. You’re really just as well off with two GPUs.

Unreal Tournament 3

We tested UT3 by playing a deathmatch against some bots and recording frame rates during 60-second gameplay sessions using FRAPS. This method has the advantage of duplicating real gameplay, but it comes at the expense of precise repeatability. We believe five sample sessions are sufficient to get reasonably consistent and trustworthy results. In addition to average frame rates, we’ve included the low frame rates, because those tend to reflect the user experience in performance-critical situations. In order to diminish the effect of outliers, we’ve reported the median of the five low frame rates we encountered.

Because UT3 doesn’t natively support multisampled antialiasing, we tested without AA. Instead, we just cranked up the resolution to 2560×1600 and turned up the game’s quality sliders to the max. I also disabled the game’s frame rate cap before testing.

I probably shouldn’t even have included these results, but I had them, so what the heck. Truth be told, UT3 just doesn’t need that much of a graphics card to do its thing, especially since the game doesn’t natively support antialiasing. With the median low frame rates at almost 30 FPS on a GeForce 9600 GT, the rest of the results are pretty much academic. In both the Nvidia and AMD camps, the three-way multi-GPU configs consistently outpace the four-way ones here, as we’ve seen in other games at lower resolutions, when the CPU overhead of managing more GPUs dominates performance.

Power consumption

We measured total system power consumption at the wall socket using an Extech power analyzer model 380803. The monitor was plugged into a separate outlet, so its power draw was not part of our measurement. The cards were plugged into a motherboard on an open test bench.

The idle measurements were taken at the Windows Vista desktop with the Aero theme enabled. The cards were tested under load running UT3 at 2560×1600 resolution, using the same settings we did for performance testing.

Note that the SLI configs were, by necessity, tested on different motherboards, as noted in our testing methods section. Also, the three- and four-way SLI systems were tested with a larger, 1200W PSU.

The GeForce 9800 GTX proves predictable again, drawing just a tad bit more power than the 8800 GTS 512. Shocking.

Notice how the 9800 GTX draws quite a bit less power, both at idle and under load, than the GeForce 8800 Ultra. That fact explains what we see with the multi-GPU configs: the G92-based options draw considerably less power than the GeForce 8800 Ultra-based ones. Oddly enough, the three- and four-way G92 SLI rigs draw almost exactly the same amount of power, both at idle and when loaded. The slightly lower core and memory clocks on the 9800 GX2, combined with the fact that only two PCIe slots are involved, may explain this result.

Noise levels

We measured noise levels on our test systems, sitting on an open test bench, using an Extech model 407727 digital sound level meter. The meter was mounted on a tripod approximately 12″ from the test system at a height even with the top of the video card. We used the OSHA-standard weighting and speed for these measurements.

You can think of these noise level measurements much like our system power consumption tests, because the entire systems’ noise levels were measured, including the stock Intel cooler we used to cool the CPU. Of course, noise levels will vary greatly in the real world along with the acoustic properties of the PC enclosure used, whether the enclosure provides adequate cooling to avoid a card’s highest fan speeds, placement of the enclosure in the room, and a whole range of other variables. These results should give a reasonably good picture of comparative fan noise, though.

Unfortunately—or, rather, quite fortunately—I wasn’t able to reliably measure noise levels for most of these systems at idle. Our test systems keep getting quieter with the addition of new power supply units and new motherboards with passive cooling and the like, as do the video cards themselves. I decided this time around that our test rigs at idle are too close to the sensitivity floor for our sound level meter, so I only measured noise levels under load.

The cooler on the 9800 GTX isn’t loud by any means, but it does have a bit of a high-pitched whine to it, and that shows up in our sound level meter readings. Nvidia may be taking a bit of a step back here with its reference coolers. The ones on the GeForce 8800 series were supremely quiet, and these newer coolers aren’t quite as good. That’s a shame.

Clustered at the bottom of the graph are the cards that required the 1200W power supply. That puppy, pardon my French, is freaking loud. Even at idle, the 9800 GTX three-way and GX2 four-way configs both registered over 50 dB, as did the three-way Ultra rig. Under load, we’re off to the symphony. Then again, did you really expect a quad SLI rig pulling over 500W at the wall socket to be quiet? Even a relatively quiet PSU would crank up its cooling fan when feeding a 500W system.

Conclusions

For most intents and purposes, Nvidia’s G92 is currently the finest graphics processor on the planet. That fact shined through as we tested the GeForce 9800 GTX as a single graphics card, and it consistently performed—as expected—better than almost any single-GPU card available. I say “almost” because the older GeForce 8800 Ultra outran it at times, but I say “available” tentatively, because the Ultra is beginning to look scarce these days. The Ultra has its drawbacks, too. It doesn’t support all of the latest features for HD movie playback, such as HDCP over dual-link DVI or H.264 decode acceleration. It draws a lot of power. And at its best, the 8800 Ultra cost about twice what the 9800 GTX now does. If you’re looking for a high-end graphics card and don’t want to go the multi-GPU route, the GeForce 9800 GTX is the way to go.

The G92 GPU’s sheer potency creates a problem for Nvidia, though, when it becomes the building block for three- and four-way multi-GPU solutions. We saw iffy scaling with these configs in much of our testing, but I don’t really blame Nvidia or its technology. The truth is that today’s games, displays, and CPUs aren’t yet ready to take advantage of the GPU power they’re offering in these ultra-exclusive high-end configurations. For the most part, we tested with quality settings about as good as they get. (I suppose we could have tested with 16X CSAA enabled or the like, but we know from experience that brings a fairly modest increase in visual fidelity along with a modest performance penalty.) In nearly every case, dual G92s proved to be more than adequate at 2560×1600. We didn’t have this same problem when we tested CrossFire X. AMD’s work on performance optimizations deserves to be lauded, but one of the reasons CrossFire X scales relatively well is that the RV670 GPU is a slower building block. Two G92 GPUs consistently perform as well as three or four RV670s, and they therefore run into a whole different set of scaling problems as the GPU count rises.

Crysis performance remains something of an exception and an enigma. We now know several things about it. As we learned here, the game doesn’t really seem to benefit from going from two CPU cores to four. We know that two G92s in SLI can run the game pretty well at 1920×1200 using its “high” quality settings. “Very high” is a bit of a stretch, and three- and four-way SLI don’t appear to be any help. We also know this game will bump up against memory size limits with 512MB of video memory, especially with GeForce cards and current drivers. Crysis would seem to be the one opportunity for the G92 SLI configs to really show what they can do, but instead, it’s the older GeForce 8800 Ultra with 768MB of memory that ends up stealing the show. The Ultra also proves to be faster in CoD4 and Quake Wars at 2560×1600, thanks to its larger memory size and higher memory bandwidth.

So G92-based three- and four-way SLI remains a solution waiting for a problem—one that doesn’t involve memory size or bandwidth limitations, it would seem.

Personally, I’m quite happy to see the pendulum swing this way, for GPU power to outdistance what game developers are throwing at it. Now, a $199 card like the GeForce 9600 GT can deliver a very nice experience for most folks. If you want more power, the 9800 GTX is a solid option that doesn’t involve the compatibility issues that SLI and CrossFire bring with them. Yet I also can’t wait to see what these sort of high-end solutions could do when put to full and proper use. Unfortunately, we may have to wait until the current wave of console ports and cross-developed games passes before we find out.
