Nvidia’s GeForce 9800 GX2 graphics card

I said just last week that GPUs are breeding like rabbits, and here we have another example of multi-GPU multiplication. The brand-spanking-new GeForce 9800 GX2 combines a pair of G92 graphics processors onto one card for twice the goodness, like an incredibly geeky version of an old Doublemint gum commercial.

Cramming a pair of GPUs into a single graphics card has a long and familiar pedigree, but the most recent example of such madness is AMD’s Radeon HD 3870 X2, which stole away Nvidia’s single-card performance crown by harnessing a duo of mid-range Radeon GPUs. The folks on the green team tend to take the heavyweight performance crown rather seriously, and the GeForce 9800 GX2 looks primed to recapture the title. We already know the G92 GPU is faster than any single graphics processor AMD has to offer. What happens when you double up on them via SLI-on-a-card? Let’s have a look.

Please welcome the deuce

Dressed all in black, the GeForce 9800 GX2 looks like it means business. That’s probably because it does. This beast packs two full-on G92 graphics processors, each running at 600MHz with a 1500MHz shader clock governing its 128 stream processors. Each GPU has its own 512MB pool of GDDR3 memory running at 1GHz (with a 2GHz effective data rate) on a 256-bit bus. For those of you in Rio Linda, that adds up to 1GB of total graphics memory and a whole lotta bandwidth. However, as in any SLI setup, memory isn’t shared between the two GPUs, so the effective memory size of the graphics subsystem is 512MB.
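
If you'd like to see where that "whole lotta bandwidth" lands, here's a minimal back-of-the-envelope sketch of the arithmetic, using only the clock, bus-width, and memory figures quoted above:

```python
# Back-of-the-envelope math for the 9800 GX2's memory subsystem, using only
# the figures quoted above: GDDR3 at 1GHz (2GHz effective) on a 256-bit bus,
# with a separate 512MB pool per GPU.

BUS_WIDTH_BITS = 256       # per GPU
EFFECTIVE_RATE_GT_S = 2.0  # 1GHz GDDR3, double data rate
NUM_GPUS = 2

per_gpu_bw = EFFECTIVE_RATE_GT_S * BUS_WIDTH_BITS / 8  # GB/s per GPU
total_bw = per_gpu_bw * NUM_GPUS                        # aggregate for the card

print(f"Per-GPU bandwidth:   {per_gpu_bw:.1f} GB/s")    # 64.0 GB/s
print(f"Aggregate bandwidth: {total_bw:.1f} GB/s")      # 128.0 GB/s

# SLI mirrors data into each GPU's local memory, so while the card carries
# 2 x 512MB = 1GB of RAM, the effective frame buffer is still 512MB.
print(f"Physical memory: {NUM_GPUS * 512}MB, effective: 512MB")
```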

The G92 GPU may be familiar to you as the engine behind the incredibly popular GeForce 8800 GT, and you may therefore be tempted to size up the GX2 as the equivalent of two 8800 GT cards in SLI. But that would be selling the GX2 short, since one of the G92 chip’s stream processor (SP) clusters is disabled on the 8800 GT, reducing its shader and texture filtering power. Instead, the GX2 is closer to a pair of GeForce 8800 GTS 512 cards with their GPUs clocked slightly slower.

Translation: this thing should be bitchin’ fast.

XFX’s rendition of the GeForce 9800 GX2

The Johnny-Cash-meets-Darth-Vader color scheme certainly works well for it. (And before you even ask, let’s settle this right away: head to head, Cash would defeat Vader, ten times out of ten. Thanks for playing.) Such color schemes tend to go well with quirky personalities, and the GX2 isn’t without its own quirks.

Among them, as you can see, is the fact that its two DVI ports have opposite orientations, which may lead to some confusion as you fumble around behind a PC trying to connect your monitor(s). Not only that, but only port number 1 is capable of displaying pre-boot output like BIOS menus, DOS utilities, or the like. Nvidia calls this port “bootable.” The second port will drive a display only once you have video drivers installed and are booted into a proper OS.

To the left of the DVI and HDMI ports in the picture above are a pair of LED indicators to further confuse and astound you. The lower blinkenlight turns green to indicate that all of the necessary power leads are connected to the GX2, while the upper one lights up blue to indicate which of the two GX2 cards in (ahem) a quad SLI setup owns the primary display port.

As you can see in the picture above, a black plastic shroud envelops the entire GX2, as if it were a Steve Jobs-style black turtleneck. The GX2’s full-coverage shroud furthers its image as a self-contained graphics powerhouse—and conceals its true, dual nature, as we’ll soon find out.

A few holes in the shroud do expose key connectors, though. This card requires both a six-pin aux PCIe power connector and an eight-pin one. Take note: plugging a six-pin connector into that eight-pin port isn’t sufficient, as it is for some Radeon cards. The GX2 requires a true eight-pin power lead. Unfortunately, space around this eight-pin plug is tight. Getting our PSU’s connector into the port took a little extra effort, and extracting it again took lots of extra effort. Nvidia claims the problem is that some PSUs don’t comply with the PCIe spec, but that’s little comfort. Cutting a slightly larger hole in the shroud would have prevented quite a few headaches.

Extra exposure below the shroud, though, doesn’t seem to be part of the program. For instance, just to the left of the six-pin power plug is an audio S/PDIF input, needed to feed audio to the GX2’s HDMI output port. This port was concealed by a rubber stopper on this XFX card out of the box.

The GX2’s SLI connector lurks under a plastic cover, as well, semi-ominously suggesting the potential for quad SLI. Those who remember the disappointing performance scaling of Nvidia’s previous quad SLI solution, based on the GeForce 7950 GX2, will be relieved to hear that the 9800 GX2 should be free from the three-buffer limit that the combination of Windows XP and DirectX 9 imposed back then. Nvidia says it intends to deliver four-way alternate frame rendering (AFR), a la AMD’s CrossFire X. That should allow for superior performance scaling with four GPUs, provided nothing else gets in the way.
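
To picture what four-way AFR means in practice, here's a tiny conceptual sketch of the round-robin idea behind alternate frame rendering. This is purely illustrative, not Nvidia's actual driver logic:

```python
# Conceptual sketch of four-way alternate frame rendering (AFR): successive
# frames are handed to the GPUs in round-robin order, so all four chips work
# on different frames at once. Purely illustrative, not driver code.

NUM_GPUS = 4

def gpu_for_frame(frame_number: int) -> int:
    """Which GPU (0-3) renders a given frame under simple round-robin AFR."""
    return frame_number % NUM_GPUS

for frame in range(8):
    print(f"Frame {frame} -> GPU {gpu_for_frame(frame)}")
```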

Price, certainly, should be no object for the typical quad SLI buyer. XFX has priced its 9800 GX2 at $599 on Newegg, complete with a copy of Company of Heroes. That puts the GX2 nearly a hundred bucks above the price of a couple of GeForce 8800 GTS 512MB cards. In this case, you’re paying a premium for the GX2’s obvious virtues, including its single PCIe x16 connection, dual-slot profile, quad SLI potential, and the happy possibility of getting SLI-class performance on a non-Nvidia chipset.

Looking beneath Vader’s helmet

After several minutes of unscrewing, prying, and coaxing, I was able to remove the 9800 GX2’s metal shroud. Beneath it lies this contraption:

Like the 7950 GX2 before it, the 9800 GX2 is based on a dual-PCB design in which each board plays host to a GPU and its associated memory. Unlike the 7950 GX2, this new model has a single, beefy cooler sandwiched between the PCBs. This cooler directs hot air away from the card in two directions: out of the lower part of the expansion slot backplate and upwards, out of the top of the shroud.

Beneath the card, a ribbon cable provides a communications interconnect between the two PCBs.

And at the other end of the card, you can see the partially exposed blades of the blower.

Sadly, that’s where my disassembly of the GX2 stopped—at least for now. Too darned many screws, and that ribbon cable looks fragile, so I chickened out. Presumably, on the PCB that houses the “primary” GPU, you’ll also find a PCI Express switch chip similar to the nForce 200 chip that Nvidia used as glue in the nForce 780i chipset.

Before we move on to see how this puppy performs, I should mention one other trick it has up its sleeve: when paired with the right Nvidia chipset with integrated graphics, the GX2 is capable of participating in a HybridPower setup. The basic idea here is to save power when you’re not gaming by handing things over to the motherboard’s integrated GPU and powering down the GX2 entirely. Then, when it’s game time, you light the fire under the big dawg again for max performance. Unfortunately, we’ve not yet been able to test HybridPower, but we’ll keep an eye on it and try to give it a spin shortly.

Another thing we’ve not yet tested is quad SLI with dual GX2s. We can show you how a single GX2 performs today; we’ll have to follow up with the scary-fast quad stuff later.

Our testing methods

As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and the results were averaged.

Our test systems were configured like so:

Common components (both systems):

Processor: Core 2 Extreme X6800 2.93GHz
System bus: 1066MHz (266MHz quad-pumped)
Memory size: 4GB (4 DIMMs)
Memory type: 2 x Corsair TWIN2X20488500C5D DDR2 SDRAM at 800MHz
CAS latency (CL): 4
RAS to CAS delay (tRCD): 4
RAS precharge (tRP): 4
Cycle time (tRAS): 18
Command rate: 2T
Hard drive: WD Caviar SE16 320GB SATA
OS: Windows Vista Ultimate x86 Edition
OS updates: KB936710, KB938194, KB938979, KB940105, KB945149, DirectX November 2007 Update

Intel X38 test system:

Motherboard: Gigabyte GA-X38-DQ6 (BIOS revision F7)
North bridge: X38 MCH
South bridge: ICH9R
Chipset drivers: INF update 8.3.1.1009, Matrix Storage Manager 7.8
Audio: Integrated ICH9R/ALC889A with RealTek 6.0.1.5497 drivers
Graphics:
  Diamond Radeon HD 3850 512MB PCIe with Catalyst 8.2 drivers
  Dual Radeon HD 3850 512MB PCIe with Catalyst 8.2 drivers
  Radeon HD 3870 512MB PCIe with Catalyst 8.2 drivers
  Dual Radeon HD 3870 512MB PCIe with Catalyst 8.2 drivers
  Radeon HD 3870 X2 1GB PCIe with Catalyst 8.2 drivers
  Dual Radeon HD 3870 X2 1GB PCIe with Catalyst 8.3 drivers
  Radeon HD 3870 X2 1GB PCIe + Radeon HD 3870 512MB PCIe with Catalyst 8.3 drivers
  Palit GeForce 9600 GT 512MB PCIe with ForceWare 174.12 drivers
  GeForce 8800 GT 512MB PCIe with ForceWare 169.28 drivers
  EVGA GeForce 8800 GTS 512MB PCIe with ForceWare 169.28 drivers
  GeForce 8800 Ultra 768MB PCIe with ForceWare 169.28 drivers
  GeForce 9800 GX2 1GB PCIe with ForceWare 174.53 drivers

nForce 680i SLI test system (SLI configs):

Motherboard: XFX nForce 680i SLI (BIOS revision P31)
North bridge: nForce 680i SLI SPP
South bridge: nForce 680i SLI MCP
Chipset drivers: ForceWare 15.08
Audio: Integrated nForce 680i SLI/ALC850 with RealTek 6.0.1.5497 drivers
Graphics:
  Dual GeForce 8800 GT 512MB PCIe with ForceWare 169.28 drivers
  Dual Palit GeForce 9600 GT 512MB PCIe with ForceWare 174.12 drivers

Please note that we tested the single and dual-GPU Radeon configs with the Catalyst 8.2 drivers, simply because we didn’t have enough time to re-test everything with Cat 8.3. The one exception is Crysis, where we tested single- and dual-GPU Radeons with AMD’s 8.451-2-080123a drivers, which include many of the same application-specific tweaks that the final Catalyst 8.3 drivers do.

Thanks to Corsair for providing us with memory for our testing. Their quality, service, and support are easily superior to what you get with no-name DIMMs.

Our test systems were powered by PC Power & Cooling Silencer 750W power supply units. The Silencer 750W was a runaway Editor’s Choice winner in our epic 11-way power supply roundup, so it seemed like a fitting choice for our test rigs. Thanks to OCZ for providing these units for our use in testing.

Unless otherwise specified, image quality settings for the graphics cards were left at the control panel defaults. Vertical refresh sync (vsync) was disabled for all tests.

We used the following versions of our test applications:

The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

G x 2 = Yow!

To give you a better sense of the kind of wallop this one graphics card packs, have a look at the theoretical numbers below. On paper, at least, the GX2 is staggering.

                       Peak pixel    Peak bilinear    Peak bilinear    Peak memory    Peak shader
                       fill rate     texel filtering  FP16 filtering   bandwidth      arithmetic
                       (Gpixels/s)   (Gtexels/s)      (Gtexels/s)      (GB/s)         (GFLOPS)
GeForce 9600 GT        10.4          20.8             10.4             57.6           312
GeForce 8800 GT        9.6           33.6             16.8             57.6           504
GeForce 8800 GTS 512   10.4          41.6             20.8             62.1           624
GeForce 8800 GTX       13.8          18.4             18.4             86.4           518
GeForce 8800 Ultra     14.7          19.6             19.6             103.7          576
GeForce 9800 GX2       19.2          76.8             38.4             128.0          1152
Radeon HD 2900 XT      11.9          11.9             11.9             105.6          475
Radeon HD 3850         10.7          10.7             10.7             53.1           429
Radeon HD 3870         12.4          12.4             12.4             72.0           496
Radeon HD 3870 X2      26.4          26.4             26.4             115.2          1056

The GX2 outclasses Nvidia’s previous top card, the GeForce 8800 Ultra, in every category. More importantly, perhaps, it matches up well against the Radeon HD 3870 X2—ostensibly its closest competitor, although the 3870 X2 is now selling for as low as $419. The two cards are fairly evenly matched in terms of pixel fill rate, memory bandwidth, and peak shader power, but look closely at the 9800 GX2’s advantage over the 3870 X2 in texture filtering capacity—it leads 76.8 to 26.4 Gtexels/s. The gap closes with FP16 texture formats, where the GX2’s filtering capacity is chopped in half, but it’s still considerable.
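
If you're curious how the GX2's row of that table is derived, the sketch below reconstructs it from per-GPU clocks and unit counts. The 16-ROP and 64-texture-unit figures for G92 aren't quoted above, so treat them as assumptions, but the outputs line up with the table:

```python
# Rough derivation of the GeForce 9800 GX2's theoretical peaks. The clocks and
# SP count come from the specs quoted earlier; the 16 ROPs and 64 bilinear
# filtering units per G92 are assumed figures, not stated in the text above.

NUM_GPUS = 2
CORE_GHZ = 0.6         # 600MHz core clock
SHADER_GHZ = 1.5       # 1500MHz shader clock
ROPS = 16              # per GPU (assumed)
TEX_UNITS = 64         # bilinear filtering units per GPU (assumed)
SPS = 128              # stream processors per GPU
FLOPS_PER_SP = 3       # MAD + MUL issued per clock
MEM_BW_PER_GPU = 64.0  # GB/s (2GHz effective x 256 bits, as computed earlier)

pixel_fill = ROPS * CORE_GHZ * NUM_GPUS            # Gpixels/s
texel_rate = TEX_UNITS * CORE_GHZ * NUM_GPUS       # Gtexels/s, INT8 bilinear
fp16_rate = texel_rate / 2                         # FP16 filtering runs at half speed
shader_gflops = SPS * FLOPS_PER_SP * SHADER_GHZ * NUM_GPUS
mem_bw = MEM_BW_PER_GPU * NUM_GPUS

print(f"{pixel_fill:.1f} {texel_rate:.1f} {fp16_rate:.1f} {mem_bw:.1f} {shader_gflops:.0f}")
# -> 19.2 76.8 38.4 128.0 1152, matching the GX2 row in the table above
```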

We can, of course, measure some of these things with some simple synthetic benchmarks. Here’s how the cards compare.

The 9800 GX2 trails the 3870 X2 in terms of pixel fill rate, but it makes up for it with a vengeance by more than doubling the X2’s multitextured fill rate, more or less as expected. In fact, in this test, the GX2 shows more texture filtering capacity than three GeForce 8800 Ultras or four Radeon HD 3870s.

Somewhat surprisingly, the Radeon HD 3870 X2 takes three of the four 3DMark shader tests from the GX2. But will that matter in real games?

Call of Duty 4: Modern Warfare

We tested Call of Duty 4 by recording a custom demo of a multiplayer gaming session and playing it back using the game’s timedemo capability. Since these are high-end graphics configs we’re testing, we enabled 4X antialiasing and 16X anisotropic filtering and turned up the game’s texture and image quality settings to their limits.

We’ve chosen to test at 1680×1050, 1920×1200, and 2560×1600—resolutions spanning roughly two to four megapixels—to see how performance scales. I’ve also tested at 1280×1024 with the lower-end graphics cards, since some of them struggled to deliver completely fluid frame rates at 1680×1050.

Here’s our first look at the GX2’s true performance, and it’s a revelation. This “single” graphics card utterly outclasses the GeForce 8800 Ultra and Radeon HD 3870 X2, outrunning three Radeon HD 3870 GPUs in a CrossFire X team and nearly matching a pair of 3870 X2 cards with four GPUs.

The only fly in the ointment is a consistent problem we’ve seen in this game with SLI configs using 512MB cards; their performance drops quite a bit at 2560×1600. The GX2 looks to be affected, although not as badly as the GeForce 8800 GT SLI and 9600 GT SLI setups we tested. And, heck, it’s still pumping out 60 frames per second at that resolution.

Enemy Territory: Quake Wars

We tested this game with 4X antialiasing and 16X anisotropic filtering enabled, along with “high” settings for all of the game’s quality options except “Shader level,” which was set to “Ultra.” We left the diffuse, bump, and specular texture quality settings at their default levels, though. Shadows, soft particles, and smooth foliage were enabled. Again, we used a custom timedemo recorded for use in this review.

I’ve excluded the three- and four-way CrossFire X configs here since they don’t support OpenGL-based games like this one.

The GX2 slices through Quake Wars with ease, again easily outperforming the Radeon HD 3870 X2 and the GeForce 8800 Ultra. The only place where that might really matter in this game is at 2560×1600 resolution. At lower resolutions, the X2 and Ultra are plenty fast, as well.

Half-Life 2: Episode Two

We used a custom-recorded timedemo for this game, as well. We tested Episode Two with the in-game image quality options cranked, with 4X AA and 16X anisotropic filtering. HDR lighting and motion blur were both enabled.

Here, the GX2 just trails not one but two Radeon HD 3870 X2s paired (or really quadded) up via CrossFire X. No other “single” card comes close. Since the GX2 averages 67 FPS at 2560×1600, even the two- and three-way SLI configs with a GeForce 8800 Ultra don’t look to be much faster in any meaningful sense.

Crysis

I was a little dubious about the GPU benchmark Crytek supplies with Crysis after our experiences with it when testing three-way SLI. The scripted benchmark does a flyover that covers a lot of ground quickly and appears to stream in lots of data in a short period, possibly making it I/O bound—so I decided to see what I could learn by testing with FRAPS instead. I chose to test in the “Recovery” level, early in the game, using our standard FRAPS testing procedure (five sessions of 60 seconds each). The area where I tested included some forest, a village, a roadside, and some water—a good mix of the game’s usual environments.

Because FRAPS testing is a time-intensive endeavor, I’ve tested the lower-end graphics cards at 1680×1050 and the higher-end cards at 1920×1200, with the GX2 and our CrossFire X configs included in both groups.

The GX2’s performance at 1920×1200 is matched only by two or three GeForce 8800 Ultras. That is, in technical terms, totally sweet. Personally, despite the seemingly low numbers, I’d consider Crysis playable at 1920×1200 and high quality on the GX2. The card is producing a 35 FPS average, but it rarely dips below that average, so it feels smooth enough. And I’m testing in a very intensive section of the game with lots of dense jungle. When you move elsewhere on the same map, frame rates can climb into the mid 40s or better.

In order to better tease out the differences between the high-end solutions, I cranked up Crysis to its “very high” quality settings and turned on 4X antialiasing.

The GX2 stumbles badly, just as the Radeon HD 3870 X2 did. Although the GX2 is a very fast single card, it can’t match two GeForce 8800 Ultras in a number of key respects, including pixel fill rate and memory bandwidth. Perhaps more notably, the GX2 has, effectively, 512MB of memory, while even a single GeForce 8800 Ultra has 768MB. That may (and probably does) explain the Ultra’s higher performance here.

Unreal Tournament 3

We tested UT3 by playing a deathmatch against some bots and recording frame rates during 60-second gameplay sessions using FRAPS. This method has the advantage of duplicating real gameplay, but it comes at the expense of precise repeatability. We believe five sample sessions are sufficient to get reasonably consistent and trustworthy results. In addition to average frame rates, we’ve included the low frame rates, because those tend to reflect the user experience in performance-critical situations. In order to diminish the effect of outliers, we’ve reported the median of the five low frame rates we encountered.
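
For the curious, here's a minimal sketch of how those five FRAPS sessions boil down into the figures we report. The session values below are placeholders for illustration, not actual results:

```python
# Minimal sketch of how five 60-second FRAPS sessions become the figures we
# report: the mean of the per-session average frame rates, plus the median of
# the per-session lows. Session values here are placeholders, not real data.
from statistics import mean, median

# (average_fps, low_fps) per session -- hypothetical numbers for illustration.
sessions = [(62.1, 41.0), (60.4, 38.5), (63.0, 40.2), (59.8, 37.9), (61.5, 39.6)]

avg_fps = mean(avg for avg, _ in sessions)
median_low = median(low for _, low in sessions)

print(f"Average FPS:    {avg_fps:.1f}")
print(f"Median low FPS: {median_low:.1f}")
```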

Because UT3 doesn’t natively support multisampled antialiasing, we tested without AA. Instead, we just cranked up the resolution to 2560×1600 and turned up the game’s quality sliders to the max. I also disabled the game’s frame rate cap before testing.

Once more, the GX2 distances itself from the GeForce 8800 Ultra and Radeon HD 3870 X2. One almost forgets this is a “single” graphics card and starts comparing it to two Ultras or two X2s. I’d say that wouldn’t be fair, but heck, the GX2 stands up pretty well even on that basis.

Power consumption

We measured total system power consumption at the wall socket using an Extech power analyzer model 380803. The monitor was plugged into a separate outlet, so its power draw was not part of our measurement. The cards were plugged into a motherboard on an open test bench.

The idle measurements were taken at the Windows Vista desktop with the Aero theme enabled. The cards were tested under load running UT3 at 2560×1600 resolution, using the same settings we did for performance testing.

Note that the SLI configs were, by necessity, tested on a different motherboard, as noted in our testing methods section.

Given its insane performance, the GX2’s power consumption is really quite reasonable. It can’t come close to matching the admirably low idle power consumption of the Radeon HD 3870 X2; even the three-way CrossFire X system draws fewer watts at idle. When running a game, however, the GX2 draws less power than the X2. That adds up to a very nice performance-per-watt profile.

Noise levels

We measured noise levels on our test systems, sitting on an open test bench, using an Extech model 407727 digital sound level meter. The meter was mounted on a tripod approximately 12″ from the test system at a height even with the top of the video card. We used the OSHA-standard weighting and speed for these measurements.

You can think of these noise level measurements much like our system power consumption tests, because the entire systems’ noise levels were measured, including the stock Intel cooler we used to cool the CPU. Of course, noise levels will vary greatly in the real world along with the acoustic properties of the PC enclosure used, whether the enclosure provides adequate cooling to avoid a card’s highest fan speeds, placement of the enclosure in the room, and a whole range of other variables. These results should give a reasonably good picture of comparative fan noise, though.

Unfortunately—or, rather, quite fortunately—I wasn’t able to reliably measure noise levels for most of these systems at idle. Our test systems keep getting quieter with the addition of new power supply units and new motherboards with passive cooling and the like, as do the video cards themselves. I decided this time around that our test rigs at idle are too close to the sensitivity floor for our sound level meter, so I only measured noise levels under load.

The GX2’s acoustic profile is a little disappointing in the wake of the miraculously quiet dual-slot coolers on recent Nvidia reference designs. It’s not horribly loud or distracting, but it does make its presence known with the blower’s constant hiss. The card is much quieter at idle, but still audible and probably louder than most.

GPU temperatures

Per your requests, I’ve added GPU temperature readings to our results. I captured these using AMD’s Catalyst Control Center and Nvidia’s nTune Monitor, so we’re basically relying on the cards to report their temperatures properly. In the case of multi-GPU configs, I only got one number out of CCC. I used the highest of the numbers from the Nvidia monitoring app. These temperatures were recorded while running UT3 in a window.

The GX2 runs hot, but not much hotter than most high-end cards.

Conclusions

The GeForce 9800 GX2 is an absolute powerhouse, the fastest graphics card you can buy today. Even Crysis is playable on the GX2 at 1920×1200 with high-quality settings. Although it is part of a clear trend toward high-end multi-GPU solutions, it is also a testament to what having a capable high-end GPU will get you. The GX2 is oftentimes faster than three Radeon HD 3870 GPUs and sometimes faster than four. Nvidia’s superior building block in the G92 GPU makes such feats possible.

At the same time, the SLI-on-a-stick dynamic works in the 9800 GX2’s favor. The GX2 absolutely whups Nvidia’s incumbent product in this price range, the GeForce 8800 Ultra. Ye olde G80 is still a formidable GPU, but it can’t keep up with two G92s. The G92 is based on a smaller fab process, so even with dual GPUs onboard, the GX2’s power consumption is similar to the Ultra’s. On top of that, the G92 GPU offers new features like built-in HD video decode acceleration that the Ultra lacks.

The only disappointment here is the GX2’s relatively noisy cooler, which isn’t terrible, but can’t match the muted acoustics of Nvidia’s other high-end graphics cards. Let’s hope this card is an exception to the rule because of its multi-GPU nature, not part of a trend toward louder coolers.

All of that said, the GX2 is still an SLI-based product, so it has its weaknesses. Multi-GPU schemes require driver updates in order to work optimally with newer games. This is an almost inescapable situation given today’s hardware and software realities. Worse yet, the GX2’s multi-monitor support is as poor as any SLI config’s. The card can accelerate 3D games on only one display, and one must switch manually between multi-GPU mode and multi-display mode via the control panel. This arrangement is light-years behind the seamless multi-monitor support in AMD’s drivers. Nvidia really needs to address this shortcoming if it expects products like the GX2 to become a staple of its lineup.

At $599, the GX2 is pricey, but Nvidia and its partners can ask such a premium for a product with no direct rivals. Obviously, this is a card for folks with very high resolution displays and, most likely, motherboards based on non-Nvidia chipsets. If you are willing to live with an nForce chipset, you could save some cash and get fairly similar performance by grabbing a couple of GeForce 8800 GT cards and running them in SLI. Two of those will run most games at 2560×1600 resolution reasonably well. I’d even suggest looking into a couple of GeForce 9600 GTs as an alternative, based on our test results, if I weren’t concerned about how their relatively weak shader power might play out in future games.

The big remaining question is how well the GX2 realizes its potential in quad-SLI configurations. We’ll find out soon and let you know.
