NVIDIA’s GeForce 6800 Ultra GPU

BY NOW, graphics aficionados are very familiar with the story of NVIDIA’s rise to the top in PC graphics, followed by a surprising stumble with its last generation of products, the NV30 series. NV30 was late to market and, when it arrived, couldn’t keep pace with ATI’s impressive Radeon 9700 chip. Worse, the GeForce FX 5800 had a Dustbuster-like cooling appendage strapped to its side that was loud enough to elicit mockery from even jaded overclocking enthusiasts. NVIDIA followed up the FX 5800 with a series of NV30-derived GPUs that got relatively better over time compared to the competition, but only because NVIDIA threw lots of money into new chip revisions, compiler technology, and significantly faster memory than ATI used.

Even so, the GeForce FX series was relatively slow at pixel shader programs and was behind the curve in some respects. ATI’s R3x0-series chips had more pixel pipelines, better antialiasing, and didn’t require as much tuning in order to achieve optimal performance. More importantly, NVIDIA itself had lost some of its luster as the would-be Intel of the graphics world. The company clearly didn’t enjoy being in second place, and sometimes became evasive or combative about its technology and the issues surrounding it.

But as of today, that’s all ancient history. NVIDIA is back with a new chip, the NV40, produced by a new, crystal-clear set of design principles. The first NV40-based product is the GeForce 6800 Ultra. I’ve been playing with one for the past few days here in Damage Labs, and I’m pleased to report that it’s really, really good. For a better understanding of how and why, let’s look at some of the basic design principles that guided NV40 development.

  • Massive parallelism — Processing graphics is about drawing pixels, an inherently parallelizable task. The NV40 has sixteen real, honest-to-goodness pixel pipelines—“no funny business,” as the company put it in one briefing. By contrast, NV30 and its high-end derivatives had a four-pipe design with two texture units per pipe that could, in special cases involving Z and stencil operations, process eight pixels per clock. The NV40 has sixteen pixel pipes with one texture unit per pipe, and in special cases, it can produce 32 pixels per clock. To feed these pipes, the NV40 has six vertex shader units, as well.


    An overview of the NV40 architecture. Source: NVIDIA

    All told, NV40 weighs in at 222 million transistors, roughly double the count of an ATI Radeon 9800 GPU and well more than even the largest desktop microprocessor. To give you some context, the most complex desktop CPU is Intel’s Pentium 4 Prescott at “only” 125 million transistors. Somewhat surprisingly, the NV40 chip is fabricated by IBM on a 0.13-micron process, not by traditional NVIDIA partner TSMC.

    By going with a 0.13-micron fab process and sixteen pipes, NVIDIA is obviously banking on its chip architecture, not advances in manufacturing techniques and higher clock speeds, to provide next-generation performance.

  • Scalability — With sixteen parallel pixel pipes comes scalability, and NVIDIA intends to exploit this characteristic of NV40 by developing a top-to-bottom lineup of products derived from this high-end GPU. They will all share the same features and differ primarily in performance. You can guess how: the lower end products will have fewer pixel pipes and fewer vertex shader units.

    Contrast that plan with the reality of NV3x, which NVIDIA admits was difficult to scale from top to bottom. The high-end GeForce FX chips had four pixel pipes with two texture units each—a 4×2 design—while the mid-range chips were a 4×1 design. Even more oddly, the low-end GeForce FX 5200 was rumored to be an amalgamation of NV3x pixel shaders and fixed-function GeForce2-class technology.

    NVIDIA has disavowed the “cascading architectures” approach where older technology generations trickle down to fill the lower rungs of the product line. Developers should soon be able to write applications and games with confidence that the latest features will be supported, in a meaningful way, with decent performance, on a low-end video card.


    A single, superscalar pixel shader unit. Source: NVIDIA
  • More general computational power — The NV40 is a more capable general-purpose computing engine than any graphics chip that came before it. The chip supports pixel shader and vertex shader versions 3.0, as defined in Microsoft’s DirectX 9 spec, including long shader programs, looping, branching, and dynamic flow control. Also, NV40 can process data internally with 32 bits of floating-point precision per color channel (red, green, blue, and alpha) with no performance penalty. Combined with the other features of 3.0 shaders, this additional precision should allow developers to employ more advanced rendering techniques with fewer compromises and workarounds.
  • More performance per unit of transistors — Although GPUs are gaining more general programmability, this trend stands in tension with the usual mission of graphics chips, which has been to accelerate graphics functions through custom logic. NVIDIA has attempted to strike a better balance in NV40 between general computing power and custom graphics logic, with the aim of achieving more efficiency and higher overall performance. As a result, NV40’s various functional units are quite flexible, but judiciously include logic to accelerate common graphics functions.

By following these principles, NVIDIA has produced a chip with much higher performance limits than the previous generation of products. Compared to the GeForce FX 5950 Ultra, NVIDIA says the NV40 has two times the geometry processing power, four to eight times the 32-bit floating-point pixel shading power, and four times the occlusion culling performance. The company modestly says this is the biggest single performance leap between product generations in its history. For those of us who are old enough to remember the jump from the Riva 128 to the TNT, or even from the GeForce3 to the GeForce4 Ti, that’s quite a claim to be making. Let’s see if they can back it up.

 

The card
The GeForce 6800 Ultra reference card is an AGP-native design; there is no bridge chip. Rumor has it a PCI Express-native version of NV40 will be coming when PCI-E motherboards arrive, but this first spin of the chip will fit into current mobos without any extra help.

This card is also quite a sight to behold, with yet another custom NVIDIA cooler onboard and two Molex connectors for auxiliary power.


The GeForce 6800 Ultra reference card

NVIDIA recommends a power supply rated to at least 480W for the GeForce 6800 Ultra, and the two Molex connectors should come from different rails on the power supply, not just a Y cable. For most of us, buying and installing a GeForce 6800 Ultra will require buying and installing a beefy new power supply, as well. (Not that it won’t be worth it.)

Clock speeds for the GeForce 6800 Ultra are 400MHz for the GPU and 550MHz for the 256MB of onboard GDDR3 memory (or 1.1GHz once you factor in the double data rate thing). For the privilege of owning this impressive piece of technology, you can expect to pay about $499.

Although the cooler design on our GeForce 6800 Ultra reference card is a dual-slot affair (that is, it hangs out over the PCI slot adjacent to the AGP slot), NVIDIA does have a single-slot cooler design that it will make available to its board partners. Here’s a picture the company provided:


NVIDIA’s single-slot cooler for the GeForce 6800 series

Expect most GeForce 6800 Ultra boards to occupy two slots, at least initially. However, judging by how little it burns my hand when I touch it, the 6800 Ultra runs quite a bit cooler than the downright sizzlin’ GeForce FX 5950 Ultra. I wouldn’t be shocked to see single-slot designs come back into favor amongst NVIDIA card makers. Then again, the dual-slot cooler runs nice and quiet, and card makers may not want to sacrifice peace and quiet for an extra PCI slot.

There will be a non-Ultra version of the GeForce 6800 available before long at $299. That product will be a single-slot design with a single Molex connector. However, it will have only twelve pixel pipes, and its clock speeds haven’t been determined yet.

 

Test notes
NVIDIA’s previous generation of cards presented a couple of intriguing problems for benchmarking. Most prominent among those, perhaps, was NVIDIA’s use of an “optimized” method of trilinear filtering, a common texture filtering technique. This “optimized” mode reduces image quality for the sake of additional performance, and over time, it has earned the nickname “brilinear” filtering, because it amounts to a halfway measure between bilinear and true trilinear filtering. Through the course of multiple driver revisions, NVIDIA introduced this new technique, pledged to make it a driver checkbox option, made it a driver checkbox option, turned it on selectively in spite of the driver checkbox setting, and removed the checkbox from the driver.
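If you’re curious what “brilinear” actually does, here is a minimal Python sketch of the idea, with a made-up blend-band width since NVIDIA has never published its real thresholds. True trilinear blends between two adjacent mip levels everywhere; the optimized mode blends only in a narrow band around each mip transition and falls back to cheaper bilinear filtering elsewhere.

```python
# A minimal sketch of the "brilinear" idea, assuming a single illustrative
# blend-band width; NVIDIA's actual thresholds are not public.

def trilinear_weight(lod: float) -> float:
    """True trilinear: blend linearly between adjacent mip levels everywhere."""
    return lod - int(lod)

def brilinear_weight(lod: float, band: float = 0.25) -> float:
    """Blend only within a narrow band around the mip transition; elsewhere
    fall back to plain bilinear filtering of the nearest mip level."""
    frac = lod - int(lod)
    if frac < 0.5 - band:
        return 0.0                      # pure bilinear from the lower mip
    if frac > 0.5 + band:
        return 1.0                      # pure bilinear from the upper mip
    return (frac - (0.5 - band)) / (2 * band)   # short blend across the seam

def filter_sample(lower_mip, upper_mip, lod, weight_fn=trilinear_weight):
    """Combine bilinear samples from two mip levels using the chosen weighting."""
    w = weight_fn(lod)
    return (1 - w) * lower_mip + w * upper_mip
```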

All of this drama has presented a conundrum for reviewers, because the change really does influence the card’s visual output, if only slightly, and its performance, a bit more noticeably. For many people, some of this stuff must sound like arguing over how many angels can fit on the head of a pin, but we do try to perform apples-to-apples comparisons whenever possible. Fortunately, NVIDIA has provided a checkbox in its NV40 drivers that disables “trilinear optimizations.”


The control panel allows for disabling “brilinear” on the GeForce 6800 Ultra

I will show you the visual and performance differences between “brilinear” and trilinear filtering in the course of this review, so you can see what you think of it. For the sake of fairness, we disabled NVIDIA’s trilinear optimizations on NV40 for the bulk of our comparative benchmark testing. ATI’s Radeon 9800 XT produces very similar image output to the GeForce 6800 Ultra with trilinear optimizations disabled, as you will see.

Unfortunately, NVIDIA’s 60.72 drivers do not provide the option of disabling trilinear optimizations on the GeForce FX 5950 Ultra, so we were unable to test NVIDIA’s previous generation card at the same image quality as its replacement and its primary competitor.


The default setting is “Quality,” which allows adaptive anisotropic filtering

NVIDIA was also kind enough to expose a setting in its driver that allows the user to disable its adaptive anisotropic filtering optimizations by choosing “High quality” image settings. However, both ATI and NVIDIA use adaptive aniso, so disabling this optimization wouldn’t really be fair. Also, I spent some time trying to find visual differences between NVIDIA’s adaptive aniso and non-adaptive aniso (using different angles of inclination, looking for mip-map level of detail changes, doing mathematical “diff” operations between screenshots) and frankly, I didn’t find much of anything. I did benchmark the two modes, as you’ll see in our texture filtering section, but I could find no reason not to leave NVIDIA’s adaptive aniso turned on during our tests.

In the end, none of these settings impacts performance by more than a few percentage points, and the GeForce 6800 Ultra is fast enough that it can easily absorb the slight handicap it carries versus the previous generation of cards.

 

Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.

Our test system was configured like so:

Motherboard: MSI K8T Neo
Processor: AMD Athlon 64 3400+ 2.2GHz
North bridge: VIA K8T800
South bridge: VIA VT8237
Chipset drivers: VIA 4-in-1 v.4.51
ATA drivers: 5.1.2600.220
Memory size: 1GB (2 DIMMs)
Memory type: Corsair TwinX XMS3200LL DDR SDRAM at 400MHz
Hard drive: Seagate Barracuda V 120GB SATA 150
Audio: Creative SoundBlaster Live!
OS: Microsoft Windows XP Professional
OS updates: Service Pack 1, DirectX 9.0b

We used ATI’s CATALYST 4.4 drivers on the Radeon card and ForceWare 60.72 beta 2 on the GeForce cards. One exception: at the request of FutureMark, we used NVIDIA’s 52.16 drivers for all 3DMark benchmarking and image quality tests on the GeForce FX 5950 Ultra.

The test systems’ Windows desktops were set at 1280×1024 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.

We used the following versions of our test applications:

All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

 

Pixel filling power
Let’s kick off this party with a look at fill rate, that ancient graphics standard of pixel-pushing power. The GeForce 6800 Ultra’s theoreticals for fill rate are staggering. Here are some common high-end cards from the past year or so for comparison, with the 6800 Ultra at the bottom of the table.

| Card | Core clock (MHz) | Pixel pipelines | Peak fill rate (Mpixels/s) | Texture units per pixel pipeline | Peak fill rate (Mtexels/s) | Memory clock (MHz) | Memory bus width (bits) | Peak memory bandwidth (GB/s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GeForce FX 5800 Ultra | 500 | 4 | 2000 | 2 | 4000 | 1000 | 128 | 16.0 |
| Parhelia-512 | 220 | 4 | 880 | 4 | 3520 | 550 | 256 | 17.6 |
| Radeon 9700 Pro | 325 | 8 | 2600 | 1 | 2600 | 620 | 256 | 19.8 |
| Radeon 9800 Pro | 380 | 8 | 3040 | 1 | 3040 | 680 | 256 | 21.8 |
| Radeon 9800 Pro 256MB | 380 | 8 | 3040 | 1 | 3040 | 700 | 256 | 22.4 |
| Radeon 9800 XT | 412 | 8 | 3296 | 1 | 3296 | 730 | 256 | 23.4 |
| GeForce FX 5900 Ultra | 450 | 4 | 1800 | 2 | 3600 | 850 | 256 | 27.2 |
| GeForce FX 5950 Ultra | 475 | 4 | 1900 | 2 | 3800 | 950 | 256 | 30.4 |
| GeForce 6800 Ultra | 400 | 16 | 6400 | 1 | 6400 | 1100 | 256 | 35.2 |

Holy Moses that’s a lotta oomph! Thanks to its sixteen pixel pipes, the GeForce 6800 Ultra doubles the theoretical peak pixel and textured pixel (texel) fill rates of the Radeon 9800 XT. Of course, memory bandwidth often limits actual fill rates, and any chip as powerful as the GeForce 6800 Ultra is likely to be limited by today’s memory technology. However, with 35.2GB/s of peak memory bandwidth thanks to a 256-bit path to its GDDR3 memory, the GeForce 6800 Ultra isn’t exactly deprived.
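The theoretical peaks in the table come from simple arithmetic, and you can check them yourself. This quick sketch reproduces the GeForce 6800 Ultra’s row; the same formulas generate the rest of the table.

```python
# Reproducing the GeForce 6800 Ultra's theoretical peaks from the table above.

core_clock_mhz     = 400     # GPU core clock
pixel_pipes        = 16
tex_units_per_pipe = 1
mem_clock_mhz      = 1100    # effective (double-pumped) GDDR3 data rate
bus_width_bits     = 256

pixel_fill = core_clock_mhz * pixel_pipes                       # 6400 Mpixels/s
texel_fill = core_clock_mhz * pixel_pipes * tex_units_per_pipe  # 6400 Mtexels/s
bandwidth  = mem_clock_mhz * 1e6 * (bus_width_bits / 8) / 1e9   # 35.2 GB/s

print(pixel_fill, texel_fill, round(bandwidth, 1))
```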

Let’s put the theory to test with some synthetic fill rate benchmarks.

The theory works out pretty well in this test. With only a single texture, the NV40 can only reach half its theoretical peak, but with multiple textures per pixel, the GPU gets very near to its theoretical peak, and in doing so, it absolutely trounces the Radeon 9800 XT and GeForce FX 5950 Ultra.

For a different look at fill rate, we also tested with D3D RightMark, which lets us see performance with up to eight textures applied.

The GeForce 6800 Ultra again outclasses the other guys by miles, especially in the more common cases of one, two, and four textures per pixel. As an 8×1 design, the Radeon 9800 XT’s performance decays much like the 16×1 NV40, while the 4×2 GeForce FX produces a zig-zag pattern, performing much better with even numbers of textures per pixel.
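Those curves fall out of basic pipeline math. As a rough model that ignores memory bandwidth and loopback overhead, assume each texture unit applies one texture per clock; the sketch below shows why the 4×2 design looks relatively better at even texture counts.

```python
# Rough theoretical pixels per clock as textures per pixel increase, ignoring
# memory bandwidth and loopback overhead. A 4x2 design needs ceil(N/2) clocks
# per pixel, so odd texture counts leave one of its two texture units idle.

from math import ceil

def pixels_per_clock(pipes: int, tex_units_per_pipe: int, textures: int) -> float:
    return pipes / ceil(textures / tex_units_per_pipe)

for n in range(1, 9):
    nv40 = pixels_per_clock(16, 1, n)   # GeForce 6800 Ultra (16x1)
    r360 = pixels_per_clock(8, 1, n)    # Radeon 9800 XT (8x1)
    nv38 = pixels_per_clock(4, 2, n)    # GeForce FX 5950 Ultra (4x2)
    print(f"{n} textures: NV40 {nv40:5.2f}   9800 XT {r360:5.2f}   FX 5950 {nv38:5.2f}")
```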

Now, on to some real games.

 

Benchmark results
We’ve presented our benchmark results in line graphs, and we’ve elected to test a full range of resolutions both with and without edge antialiasing and anisotropic texture filtering. That may seem deeply dorky to you as you look at some of these results, because in older games, none of these cards is straining at all at 1600×1200 without AA and aniso. However, I happen to think that format is illuminating, so bear with me. Sometimes, every once in a while, performance in a game is limited by something other than fill rate, and tests at lower resolutions will show that.

Unreal Tournament 2004

UT2004 seems to be limited to about 70 frames per second no matter what, but once we reach higher resolutions, even without AA and aniso, the gap between the GeForce 6800 Ultra and the older cards begins to become apparent. As the fill rate requirements grow, the NV40 pulls further and further away.

The above numbers were taken with the game’s default settings. To better test these high-end cards, I also tried UT2004 with all the visual quality settings maxed out.

The UT announcer does a lot of cursing, but these cards don’t seem to mind the highest settings one bit. Frame rates are barely affected.

Quake III Arena

Q3A isn’t shy about reaching much higher framerates at low res than UT2004, and once we turn up the resolution and the eye candy, the GeForce 6800 Ultra starts to look like a monster.

 

Comanche 4

Like UT2004, Comanche 4 can’t seem to break 70 frames per second no matter what. This older DirectX 8-class game doesn’t slow down at all at 1600×1200, but with antialiasing, the GeForce 6800 Ultra again pulls far ahead of the Radeon 9800 XT and GeForce FX 5950 Ultra.

Wolfenstein: Enemy Territory

Wolf: ET is based on the Quake III engine, but it appears to use much larger textures and perhaps more polygons than Q3A. Again, the GeForce 6800 Ultra delivers a stompin’.

The Radeon 9800 XT really seems to struggle here, more so than it did with older driver revisions. I’m not quite sure why.

 

Far Cry
Far Cry easily has the most advanced engine of any game on store shelves today, with extensive shadowing, lush vegetation, real DirectX 9-class lighting, pixel shader water effects, the works. Of our game benchmarks, this may be the most representative of future games.

However, Far Cry benchmarking is a little dicey, because the game’s demo recording feature doesn’t record things like environmental interactions with perfect accuracy. I was able to manage a fairly repeatable benchmarking scenario by recording the opening stages of the game and using the save/load game feature. The player walks through some tunnels, comes out into the open, and walks down to the beach. There’s no real interaction with the bad guys, which makes the sequence repeatable. Far Cry’s eye candy is still on prominent display. I used FRAPS to capture frame rates during playback.

Rather than test both with and without forced AA and aniso, I used the in-game settings. For this test, I cranked up the machine spec setting and all the advanced visual settings to “Very High” with the aniso setting maxed out at 4. These settings are significantly more demanding than the game’s defaults, but I really wanted to push these cards for once.

The 5950 trails well behind the Radeon 9800 XT in this DX9 game, as we’ve long feared might be the case. However, the GeForce 6800 Ultra has no trouble outpacing the 9800 XT.

AquaMark

AquaMark appears to be vertex shader limited at lower resolutions, because the GeForce 6800 Ultra outruns the older cards by a fair margin, even at 640×480. The situation doesn’t change much as resolutions increase, either. The NV40 just can’t be denied.

 

Splinter Cell

Splinter Cell is fill rate limited on the Radeon and GeForce FX, but clearly not on the GeForce 6800 Ultra, even at 1600×1200. Let’s look at frame rates over time, so we can see where the peaks and valleys are during gameplay.

At higher resolutions, the other cards’ peaks are looking a lot like the GeForce 6800 Ultra’s valleys. Not only are frame rate averages up with NV40; frame rates are up across the board.

 

Serious Sam SE
Serious Sam can run in either Direct3D or OpenGL. Since the game’s default mode is OpenGL, we ran our tests with that API. To keep things on a level playing field, we used the “Default settings” add-on to defeat Serious Sam SE’s graphics auto-tuning features.

The GeForce FX and Radeon 9800 XT are pretty evenly matched, but the GeForce 6800 Ultra barely breaks a sweat.

With AA and aniso on, the GeForce 6800 Ultra’s frame rate doesn’t dip below 80 FPS at 1600×1200. The Radeon 9800 XT doesn’t once reach 60 FPS.

 

3DMark03
At FutureMark’s request, we are using NVIDIA’s 52.16 drivers for the GeForce FX 5950 in this test. FutureMark says newer NVIDIA drivers have not been validated for use with 3DMark03. Unfortunately, we have no way of using a validated driver with the GeForce 6800 Ultra, so we have no choice but to report the results from the 60.72 drivers here. Make of these results what you will.

In 3DMark03’s overall composite score and in each of the component tests, the GeForce 6800 Ultra positively dominates, nearly doubling the Radeon 9800 XT’s overall score at 1024×768. You may have heard rumors that the NV40 could hit over 12,000 in 3DMark03. I’m sure that’s quite true. Our Athlon 64 3400+ test rig isn’t the fastest available platform for 3DMark03—not even close. Coupled with a Pentium 4 Extreme Edition at 3.4GHz, the GeForce 6800 Ultra might top 13K in 3DMark03.

The synthetic pixel and vertex shader tests confirm the GeForce 6800 Ultra’s proficiency. The vertex shader scores appear to support NVIDIA’s claim that NV40 has twice the vertex shader power of NV38.

 

3DMark image quality
The Mother Nature scene from 3DMark has been the source of some controversy over time, so I wanted to include some screenshots to show how the three cards compare. On this page and in all the following pages with screenshots, you’re looking at low-compression JPEG images. You can click on the image to open a new window with a lossless PNG version of the image.


Radeon 9800 XT


GeForce 6800 Ultra


GeForce FX 5950 Ultra

The results look very similar between the three cards, at least to my eye.

 

ShaderMark 2.0
ShaderMark is intended to test pixel shader performance with DirectX 9-class pixel shaders. Specifically, ShaderMark 2.0 is geared toward pixel shader revisions 2.0 and 2.0a. (Version 2.0a or “2.0+” uses longer programs and flow control.) ShaderMark also has the ability to use a “partial precision” hint on NVIDIA hardware to request 16-bit floating-point mode. Otherwise, the test uses 32 bits of precision on NVIDIA cards and, no matter what, 24 bits per color channel on the Radeon 9800 XT due to that chip’s pixel shader precision limits.
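For reference, the practical gap between these formats comes down to mantissa width: FP16 carries a 10-bit mantissa, ATI’s FP24 is usually described as a 16-bit-mantissa format, and FP32 has 23 bits. Here’s a quick sketch of the relative precision each implies; these are the commonly cited layouts, not anything ShaderMark itself reports.

```python
# Relative precision of the shader formats discussed above: the smallest step
# near 1.0 is roughly 2^-mantissa_bits. The FP24 layout (16-bit mantissa) is
# the commonly cited description of ATI's R3x0 pixel shader format.

formats = {
    "FP16 (NVIDIA partial precision)": 10,
    "FP24 (ATI R3x0 pixel shaders)":   16,
    "FP32 (NVIDIA full precision)":    23,
}

for name, mantissa_bits in formats.items():
    print(f"{name}: ~{2.0 ** -mantissa_bits:.1e} relative precision")
```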

Unfortunately, even on NV40, some of ShaderMark’s shaders won’t run right. We found this to be the case with the DeltaChrome S8 Nitro, as well, so I suspect the problem has to do with the program looking for ATI-specific capabilities of some sort.

I’d say NVIDIA has put its pixel shader performance problems behind it. The performance delta between the GeForce FX and the GeForce 6800 series is massive. The Radeon 9800 XT is about half the speed, with 24 bits of precision, of the GeForce 6800 Ultra with 32 bits. And in this context, the NV40’s ability to use 16-bit precision looks like another big advantage, because 16 bits is all that’s necessary to render any of these shaders without visible artifacts.

 

ShaderMark image quality
One of the quirks of running ShaderMark on the GeForce 6800 Ultra with 32-bit precision was a texture placement problem with the background texture (not the pixel shaders or the orb thingy, just the background.) The problem didn’t show up in “partial precision” mode, but it did in FP32 mode. Since the changing background texture is distracting, and since 16 bits per color channel is more than adequate for these pixel shaders, I’ve chosen to use the “partial precision” images from the GeForce 6800 Ultra. Also, both modes showed an apparent gamma problem with the pixel shaded objects on the NV40. They’re very bright. There’s nothing I could do about that.

The images shown are the GeForce 6800 Ultra screenshots, until you run your mouse over them, at which point the Radeon-generated images will appear.

Per pixel diffuse lighting (move mouse over the image to see the Radeon 9800 XT’s output)

Point phong lighting (move mouse over the image to see the Radeon 9800 XT’s output)

Spot phong lighting (move mouse over the image to see the Radeon 9800 XT’s output)

Directional anisotropic lighting (move mouse over the image to see the Radeon 9800 XT’s output)

Bump mapping with phong lighting (move mouse over the image to see the Radeon 9800 XT’s output)

Self shadowing bump mapping with phong lighting (move mouse over the image to see the Radeon 9800 XT’s output)

Procedural stone shader (move mouse over the image to see the Radeon 9800 XT’s output)

Procedural wood shader (move mouse over the image to see the Radeon 9800 XT’s output)

Aside from the gamma differences, the output is nearly identical.

 

Texture filtering
To measure texture filtering performance, we used Serious Sam SE running at 1600×1200 resolution in Direct3D mode. Note that I’ve tested the GeForce 6800 Ultra with several different settings. The “base” GeForce 6800 Ultra setting is with trilinear optimizations disabled and adaptive anisotropic filtering enabled. I also tested with “brilinear” filtering (or trilinear optimizations) enabled. Finally, the “High Quality” setting disables adaptive anisotropic filtering, forcing full aniso filtering on all surfaces.

Well, it really doesn’t matter which optimizations are disabled. The GeForce 6800 Ultra is mind-numbingly fast regardless. However, you can see that fudging trilinear with the “brilinear” optimization really does make a difference in performance. Adaptive aniso’s performance impact is much smaller, as is its impact on image quality.

 

Texture filtering quality
Here’s a sample scene from Serious Sam, grabbed in Direct3D mode, that shows texture filtering in action along the wall, the floor, and on that 45-degree inclined surface between the two.


Trilinear filtering – Radeon 9800 XT


Trilinear filtering – GeForce 6800 Ultra


“Brilinear” filtering – GeForce 6800 Ultra


Trilinear filtering – GeForce FX 5950 Ultra

The difference between “brilinear” filtering and true trilinear is difficult to detect with a static screenshot and the naked eye—at least, it is for me. Remember that the GeForce FX 5950 is doing “brilinear” filtering in all cases.

 

Texture filtering quality
Once we dye the mip maps different colors using Serious Sam’s built-in developer tools, we can see the difference between the filtering methods more clearly.


Trilinear filtering – Radeon 9800 XT


Trilinear filtering – GeForce 6800 Ultra


“Brilinear” filtering – GeForce 6800 Ultra


Trilinear filtering – GeForce FX 5950 Ultra

Here’s NVIDIA’s trilinear optimization at work. Mip-map boundary transitions aren’t as smooth as they are on the Radeon 9800 XT and on the GeForce 6800 Ultra with “brilinear” disabled.

 

Anisotropic texture filtering quality


16X anisotropic filtering – Radeon 9800 XT


16X anisotropic filtering – GeForce 6800 Ultra


16X anisotropic + “brilinear” filtering – GeForce 6800 Ultra


8X anisotropic filtering – GeForce FX 5950 Ultra

At long last, NVIDIA cards are able to do 16X anisotropic filtering plus trilinear. Again, not that we can tell the difference much, at least in this example. All of the cards look great.

 

Anisotropic texture filtering quality


16X anisotropic filtering – Radeon 9800 XT


16X anisotropic filtering – GeForce 6800 Ultra


16X anisotropic + “brilinear” filtering – GeForce 6800 Ultra


8X anisotropic filtering – GeForce FX 5950 Ultra

With our mip maps dyed various colors, the trilinear optimizations become very easy to see here. Notice how much more aggressive the filtering is on the inclined surface between the floor and the wall in the bottom two shots.

 

Antialiasing
You’ve already seen the GeForce 6800 Ultra put up some amazing performance numbers with 4X antialiasing and 8X anisotropic filtering. Let’s have a look at how it scales across the various AA modes.

Not bad, although that 8X AA mode is a killer.

Like ATI, NVIDIA uses multisampling for antialiasing, a technique that avoids unnecessary texture reads on subpixels (or fragments) that don’t lie along an object’s edge. This is a pretty efficient way to do edge AA, but it does produce some artifacts. Fortunately, NVIDIA has improved the NV40’s antialiasing over the NV30 series in several key ways, one of which is support for “centroid sampling,” which avoids retrieving incorrect texture samples and prevents one of the artifacts associated with multisampling. There was a brouhaha over centroid sampling in relation to Half-Life 2 not long ago, because ATI’s R300 series supports centroid sampling and NV3x requires a workaround. Now, when applications request it, they can get centroid sampling from the NV40.

Another improvement to NV40’s antialiasing is a rotated grid pattern for 4X AA. I’m not sure why, but the GeForce FX series used a sampling grid aligned with the screen. Rotated grid patterns have been around since at least the 3dfx Voodoo 5 series, and they have the advantage of handling near-vertical and near-horizontal edges better. Rotated grids also help throw off the eye’s pattern recognition instincts when a scene is in motion, arguably producing superior results. The holy grail in AA is a truly random sampling pattern from pixel to pixel, but a rotated grid is a nice first step.
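To make the ordered-versus-rotated distinction concrete, here is a small sketch of the two 4X sample layouts. These are textbook patterns with a commonly used rotation angle, not NVIDIA’s or ATI’s actual sample offsets.

```python
# Illustrative 4X AA sample positions within one pixel (0..1 on each axis).
# Textbook patterns only; not NVIDIA's or ATI's actual offsets.

import math

# Ordered grid: a 2x2 lattice aligned with the screen axes (GeForce FX-style 4X).
ordered_grid = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]

def rotate(points, degrees, cx=0.5, cy=0.5):
    """Rotate sample points about the pixel center."""
    a = math.radians(degrees)
    return [(cx + (x - cx) * math.cos(a) - (y - cy) * math.sin(a),
             cy + (x - cx) * math.sin(a) + (y - cy) * math.cos(a))
            for x, y in points]

# Rotated grid: no two samples share a row or column, so a near-horizontal or
# near-vertical edge sweeps across more distinct sample positions.
rotated_grid = rotate(ordered_grid, math.degrees(math.atan(0.5)))  # ~26.6 degrees

print("ordered:", ordered_grid)
print("rotated:", [(round(x, 2), round(y, 2)) for x, y in rotated_grid])
```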

I used a cool little antialiasing pattern viewer to show each card’s AA patterns at different sample rates. The red dots are the geometry sample points, and the green dots are texture sample points. You can see below how the GeForce 6800 Ultra has a rotated grid pattern at 4X AA, while the GeForce FX does not.


Radeon 9800 XT – 2X AA


GeForce 6800 Ultra – 2X AA


GeForce FX 5950 Ultra – 2X AA


Radeon 9800 XT – 4X AA


GeForce 6800 Ultra – 4X AA


GeForce FX 5950 Ultra – 4X AA


Radeon 9800 XT – 6X AA


GeForce 6800 Ultra – 8X AA


GeForce FX 5950 Ultra – 8X AA

The GeForce FX does have rotated grid multisampling at 2X and 8X AA, but not at 4X. The NV40 corrects that oversight. Notice, also, how NVIDIA’s 8X modes appear to do texture sampling closer to the geometry sample points. If this is a correct representation, NVIDIA’s method may eliminate some of the artifacts associated with multisampling without resorting to true centroid sampling.

Also, check out that funky sample pattern for the Radeon 9800 XT at 6X AA. You have got to like that.

 

Antialiasing quality
We’ll start off with non-AA images, just to establish a baseline.


No AA – Radeon 9800 XT


No AA – GeForce 6800 Ultra


No AA – GeForce FX 5950 Ultra

 

Antialiasing quality


2X AA – Radeon 9800 XT


2X AA – GeForce 6800 Ultra


2X AA – GeForce FX 5950 Ultra

2X AA shows little difference between the three cards.

 

Antialiasing quality


4X AA – Radeon 9800 XT


4X AA – GeForce 6800 Ultra


4X AA – GeForce FX 5950 Ultra

At 4X AA, the GeForce 6800 Ultra is clearly superior to the GeForce FX 5950 Ultra on near-vertical and near-horizontal edges, like the edges of the big bomber’s tail wings. To my eye, the Radeon 9800 XT and GeForce 6800 Ultra are pretty well matched. You may want to click through to look at the uncompressed PNG images and see for yourself. I think perhaps, just maybe, the Radeon 9800 XT does a better job of smoothing out jaggies. Just a little.

 

Antialiasing quality


6X AA – Radeon 9800 XT


8X AA – GeForce 6800 Ultra

Notice a couple of clear differences here. Check out the tail section of the plane in the lower left corner of the screen. The Radeon 9800 XT’s 6X AA mode has done a better job smoothing out rough edges than the GeForce 6800 Ultra at 8X AA. That may be the result of ATI’s funky sampling pattern; I’m not sure. Then check out the dark emblem at the top of the tail fin. On the Radeon, it’s very blurry, while on the GeForce 6800 Ultra it’s quite defined. If you click back through and look at the pictures without AA, that marking looks more like it does in the Radeon 6X AA shot.

Interesting. Not sure what to make of that.


6xS AA – GeForce FX 5950 Ultra


8X AA – GeForce FX 5950 Ultra

 

Antialiasing quality
To illustrate the effects of antialiasing, I’ve run a “diff” operation between each card’s original output and 4X antialiased output.


Difference between no AA and 4X AA – Radeon 9800 XT


Difference between no AA and 4X AA – GeForce 6800 Ultra


Difference between no AA and 4X AA – GeForce FX 5950 Ultra

The GeForce 6800 Ultra clearly “touches” more edge pixels than its predecessor. The edge outlines are more pronounced in the GeForce 6800 Ultra image. However, the Radeon 9800 XT touches more pixels than either—though not necessarily more edge pixels. The Radeon appears to modify more of the screen than the GeForce 6800 Ultra, which is probably less efficient.
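For the curious, difference images like these can be produced with nothing more exotic than a per-pixel absolute difference between two screenshots. Here’s a sketch using the Pillow imaging library; the filenames are placeholders for any two same-sized captures.

```python
# Per-pixel "diff" of two screenshots, which is all a comparison like the one
# above requires. Filenames are placeholders for any two same-sized captures.

from PIL import Image, ImageChops   # pip install pillow

no_aa = Image.open("no_aa.png").convert("RGB")
aa_4x = Image.open("4x_aa.png").convert("RGB")

diff = ImageChops.difference(no_aa, aa_4x)   # |a - b| for each channel
diff.save("diff_no_aa_vs_4x.png")

# Count how many pixels the AA mode touched at all.
touched = sum(1 for px in diff.getdata() if px != (0, 0, 0))
print(f"{touched} of {no_aa.width * no_aa.height} pixels differ")
```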

 

High dynamic range image-based lighting
The GeForce 6800 Ultra now supports the proper floating-point texture formats for my favorite DirectX 9 demo. Have a look at some screenshots to see how it looks.


Radeon 9800 XT


GeForce 6800 Ultra


GeForce FX 5950 Ultra

The banding is obvious on the GeForce FX, but the GeForce 6800 Ultra runs this demo like a champ, faster and, I think, with discernibly better output in some cases thanks to its FP32 pixel shaders.

 

High dynamic range image-based lighting
Here’s another example where the GeForce FX struggled.


Radeon 9800 XT


GeForce 6800 Ultra


GeForce FX 5950 Ultra

The FX just couldn’t handle this lighting technique, at least as this program implemented it. The NV40 has no such problems.

In fact, I should mention in relation to high-dynamic-range lighting that the NV40 includes provisions to improve performance and image fidelity for HDR lighting techniques over what ATI’s current GPUs support. John Carmack noted one of the key limitations of first-gen DirectX 9 hardware, including R300, in his .plan file entry from January 2003:

The future is in floating point framebuffers. One of the most noticeable thing this will get you without fundamental algorithm changes is the ability to use a correct display gamma ramp without destroying the dark color precision. Unfortunately, using a floating point framebuffer on the current generation of cards is pretty difficult, because no blending operations are supported, and the primary thing we need to do is add light contributions together in the framebuffer. The workaround is to copy the part of the framebuffer you are going to reference to a texture, and have your fragment program explicitly add that texture, instead of having the separate blend unit do it. This is intrusive enough that I probably won’t hack up the current codebase, instead playing around on a forked version.

So in order to handle light properly, the cards had to use a pixel shader program, causing a fair amount of overhead. The NV40, on the other hand, can do full 16-bit floating-point blends in the framebuffer, making HDR lighting much more practical. Not only that, but NV40 uniquely supports 16-bit floating-point precision for texture filtering, including trilinear and anisotropic filtering up to 16X. I’d hoped to see something really eye-popping, like Debevec’s Fiat Lux running in real time using this technique, but no such luck yet. Perhaps someone will cook up a demo soon.
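The precision argument is easy to demonstrate with a toy example. This sketch (plain numpy, not any actual GPU code path) shows what happens when many light contributions are accumulated into an 8-bit framebuffer versus a 16-bit floating-point one.

```python
# Toy model of HDR light accumulation: adding many light passes into an 8-bit
# framebuffer rounds away dim contributions and clamps at 1.0, while an FP16
# buffer keeps the whole range for later tone mapping. Not actual GPU code.

import numpy as np

passes = [0.0015] * 50 + [0.9, 0.7]      # many dim lights plus two bright ones

# 8-bit fixed-point blending: quantize to 1/255 steps and clamp on every add.
fixed = 0.0
for p in passes:
    fixed = min(1.0, round((fixed + p) * 255) / 255)

# FP16 blending: no clamp, about three decimal digits of precision.
fp16 = np.float16(0.0)
for p in passes:
    fp16 = np.float16(fp16 + np.float16(p))

print(f"8-bit framebuffer: {fixed:.3f}  (dim passes lost, bright passes clamped)")
print(f"FP16 framebuffer:  {float(fp16):.3f}  (true sum is {sum(passes):.3f})")
```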

 
Power consumption
So what happens when you plug a graphics chip with 222 million transistors running at 400MHz into your AGP slot? I got my trusty watt meter and took some readings to find out. I tested the whole system, as shown in our Testing Methods section, to see how many watts it pulled. The monitor was plugged into a separate power source. I took readings with the system idling at the Windows desktop and with it running the real-time HDR lighting demo at 1024×768 full-screen. For the “under load” readings, I let the demo run a good, long while before taking the reading, so the card had plenty of time to heat up and kick on its cooler if needed.

At idle, the NVIDIA chips do a nice job of cutting their power consumption. Under load, the GeForce 6800 Ultra can really suck up the juice. Surprisingly, though, it’s not much worse than the GeForce FX 5950 Ultra.

 


NVIDIA’s Dawn gets an update in the “Nalu” demo

Conclusions
We’ve covered a lot of ground in this review, and there are still many things I haven’t touched on. NVIDIA was very open about sharing NV40 architectural details, but I haven’t discussed them in any depth. The chip has an all-new video processing unit, too, that deserves some attention, especially because it can encode and decode MPEG 2 and 4 video. There are big issues to address, such as DirectX shader model 3.0 and its differences from 2.0. (Thumbnail sketch: Pixel shader 3.0 will matter for performance more than anything else, because developers will not be writing shaders in assembly. Some elements of the PS 3.0 spec will enhance image quality by requiring FP32 precision, as well.) I’ve even left out some of NV40’s new 3D capabilities, like vertex texture fetch and displacement mapping.

There was simply no time for me to do all this testing and address all of these things in the past week or so. However, at the end of the day, we need to know more about what ATI is cooking up before we can put many of the NV40’s more advanced features into context. That day will come soon enough, and we’ll examine these issues in more detail when it does.

For now, I think we’ve established a few things about the NV40 with all of our testing. First and foremost among these is the fact that NVIDIA isn’t blowing smoke this time around. Many of our crazy tests in this mega-long review are aimed at exposing weaknesses or just verifying proper operation of the GPU, and the GeForce 6800 Ultra passed with flying colors. The NV40 is exceptionally good, with no notable weaknesses in performance or capability. NVIDIA has caught up to ATI’s seminal R300 chip in virtually every respect, while adding a host of new features that make NV40 a better graphics processor, including long shader programs, more mathematical precision, and floating-point framebuffer blends.

And it’s a freaking titan of graphics performance.

Not only is the sixteen-pipe GeForce 6800 Ultra fast, but it’s also rather efficient, extracting gobs more fill rate out of just a little more memory bandwidth, relatively speaking, than the GeForce FX 5950 Ultra.

All of this bodes well for the coming generation of graphics processors based on NV40, from the twelve-pipe GeForce 6800 on down the line. An eight-pipe version of this chip would make a dandy $199 card, and it would almost certainly give an eight-pipe R300 more than it could handle. NVIDIA has left ATI with no margin for error. 

Comments closed
    • mstrmold
    • 16 years ago

    Buub, re-read my post. I said I know I was giving up certain concessions, but
    I don’t believe video should be one of them. To say that taking up two slots for one device is the future of video? Well then I’m sure future designs will merit a change. My whole point quite simply was I’d go Nvidia but they made the choice for me. It looks like the company up north will be seeing my $$$ instead. I was surprised that since Nvidia’s last Gen parts were an automatic no no in a SFF box for high end graphics that they’d want to take down ATI as being “the only choice”. Guess not.

    -E

      • Pete
      • 16 years ago

      Please learn to use the Reply feature. TIA.

    • derFunkenstein
    • 16 years ago

    nVidia left ATI with no room for error when they introduced the GF4 Ti series, and ATI responded without error with the R300-core 9700. nVidia fell flat on their faces with the GFFX, and ATI got lazy with the R350 and R360 9800 series. Now that ATI has been left no margin for error, we, the consumers, are hoping ATI responds and that the R3x0 series isn’t a fluke. It’s exciting times!

      • PLASTIC SURGEON
      • 16 years ago

      I would not say the R350 core was a lazy design for ATI. It was a refresh the same as the 5900, 5950. Only difference being the 9800 series was still faster than the 5900 series and offered better IQ. ATI did not get lazy. They just played the right cards and won.

    • nexxcat
    • 16 years ago

    did I see dual-DVI? On a reference design of a “consumer” card? Ooooooooooh.

    • us
    • 16 years ago

    Very nice review and great card from Nvidia.

    • newbie_of_jan0502
    • 16 years ago

    WOW! And it’s about time!

      • JustAnEngineer
      • 16 years ago

      NVidia finally has a convincing response to ATI’s August 2002 shipment of Radeon 9700 Pro.

        • Grigory
        • 16 years ago

        Plus a convincing response to any other card ATI has or will have for the next couple of months. (Unless the leaked specs are wrong.)

    • Sargent Duck
    • 16 years ago

    The NV40 has some nice numbers, and it is quite an improvement over the last generation (which is to be expected), but I’m even more anxious to see a real competitor to the NV40, aka the R420. Of course the NV40 is looking real good against a core that’s a year old.

    Although, I’m still a little wary of all that bull Nvidia tried to shove down our throats last year with driver cheating and such. When I look at the screenshots, I can’t help but wonder, “are they cheating again?”

    • Gom
    • 16 years ago

    It doesn’t make sense. Toms has exactly the same conclusion about power. And it doesn’t add up. That much higher load does_not_justify_recommending_a_480W_PSU_!_FFS_!!! It has to be a worst case scenario. “In case you have a cheap PSU and because we know most ppl just don’t HAVE 2 dedicated molex’s for our card we highly recommend u use a 480W PSU when using our new card. Just to be on the safe side. If you however HAVE a good PSU (good 300W PSU’s perform far better than cheap 400W etc) and can connect dedicated cables to our chip, u shouldn’t have a problem”. I mean, otherwise it doesn’t make sense. U saw the watt requirements in the review. It doesn’t justify going up to a 480W PSU. I dare this site to try and see what PSU’s they can make the card run on. Take three expensive from 350W up and three cheap from 350W up and see if they can make the card run without probs.

      • Buub
      • 16 years ago

      My guess is that it’s to cover their butts for people who overclock heavily and have cheap-ass power supplies like the junk that comes with many cases these days.

        • atidriverssuck
        • 16 years ago

        rightly so too, ’cause there is no shortage of those. Let’s not even get into what a cheap psu can do with regards to stressing components…

      • us
      • 16 years ago

      Maybe it’s sensitive to voltage fluctuation? So NV wants more margin.

      • Disco
      • 16 years ago

      Based on the fact that the 6800 doesn’t draw that much more power than the 5950 (and even less than the 9800 at idle) and nVidia’s propensity for hyperbole, I think the power recommendations are simply a marketing gimmick.

      “Our card is so POWERFUL it needs a 500W power supply!” I’m sure that this card will be fine in much lower-rated PSUs - I guess we’ll find out over the coming months...

      PS - This card looks good, but I’m waiting for some ATI product - The FX crap from over the last 2 years has left a pretty bitter taste in my mouth. And even if the 6800 remains the fastest, it seems that ATI generally has a better performance/$$ ratio.

        • indeego
        • 16 years ago

        A power supply requirement basically wipes out much of your market. They wouldn’t do it unless they were certain it outweighed the cost/benefit estimates. They HAVE to play safe, because killing a system means they are up for a class action.

    • mstrmold
    • 16 years ago

    So what, for me to be a hardcore gamer, my video card must take up two slots? 🙂
    Sorry, I couldn’t resist. I went from a full tower to a SFF LAN box as they tend to be much quieter. Yes, I gave up on the expandability of the box as far as adding extra hard disks and PCI devices, but I don’t see how that factors in for video. If you have an AGP slot, then it should work. Yes, I know GPUs actually have double the number of transistors of CPUs and I have no issues with higher requirements for power, but needing a cooler that blocks adjacent PCI slots? Crazy. If I remember the older PCI specification, you could not design an add in card that blocked adjacent slots. With AGP it’s different, right? 😉

    Ah well… my whole point was that I couldn’t use it (as stated) but it was inferred by another poster that I “knew” this was how it was to be since I went to an SFF. Arguing about semantics on the internet, I know I know…

    -E

    • Convert
    • 16 years ago

    This card is what the FX should have been. Plain and simple.

    1. It is interesting to see how close the 5950 is to the 9800xt. How the “experts” talk these days you would have thought it was comparable to the 9200.

    2. Most people need to close their mouth over the cooling solution. Take a look at how many different cooling solutions are out there for retail cards. There are a few reasons why they call it a sample.

    3. Did I mention this is what the fx should have been?

    4. Have people already forgotten how well their cards progress after some driver updates. And NO I am not talking about the historic FX “tweaks”. Take their past cards as an example.

    5. I was actually really surprised to see this as a pure agp incarnation. I am curious as to what the PCI-E interface will bring.

    6. This card doesn’t actually suck that much power, as you can plainly see by the tests. The idle speeds make up for its peak usage. Everyone I know uses their computer more often for browsing/piddly tasks than strait gaming all day long. I am curious though as to the power required for a cold boot, that’s going to be the real test.

    7. Just get over the whole SFF FOR PETES SAKE! I will see where you’re coming from when your % grows. I don’t want to hear the “I am a system integrator and SFF is on a steep rise, OMG!!111”. I know a very large amount of system integrators, SFF is increasing, just as any new tech would but it’s nothing to write home about….. Lets see what RETAIL cards do for cooling, I am sure some manufacture out there wants to cater to the sff crowd, remember this isn’t ati were talking about, nvidia relies on other manufacturers. I DO see where you’re coming from if this thing blows out your PS at boot up, by all means I am with you on that one. Also, for some cases the agp is up against the side of the case, there is room for some extra cooling (notice how the cooling doesn’t point out the back of the case requiring another open slot, now all you need is space).

    “/me let’s nVidia out of the Dog House on probation.”

    Same here, nvidia screwed up way too big the last time around. They have some work ahead of them to restore the trust of countless enthusiasts.

    I am slightly worried over all. While the improvements were huge on the higher rez tests the regular test leave you wanting more (maybe there is some type of bottleneck going on). Again perhaps some real drivers will help. It’s almost safe to say without a doubt ATI is going to give this card a good run for its money. As a consumer that’s music to my ears.

    You guys really did a great job with this review. I can’t wait to see what you do for the ATI card. Thank you very much for your time, I loved every minute.

      • My Johnson
      • 16 years ago

      In HDR lighting the FX failed terribly. It’s no comparison to an R3XX.

        • Convert
        • 16 years ago

        Your point? I am not sticking up for the 5950 by any means, the thing has some issues plain and simple. I would imagine your responding to the “9200” comment.

        The card wasn’t a total push over, every single benchmark out there covering the 6800 is showing that the 5950 is EXTREMELY close to the 9800XT, there might be a test in there were the 5950 hits rock bottom but I also saw a test on [H] that shows one test where the 9800XT is stomped, prob driver issues though.

        I don’t want to get into a “My dick is bigger” conversation over which card to pick, if I wanted that I could have put a whole lot more into #1.

          • My Johnson
          • 16 years ago

          Oh yeah, the 5950 is good enough for me too, but it comes down to IQ if speed is equal.

      • MorgZ
      • 16 years ago

      dude the bottleneck is probably in the games themselves or the rest of the system.

      As others have said though, im most interested at the farcry benchmarks which were not as wonderful as i had been hoping (even though the card as a whole seems awesome)

    • mstrmold
    • 16 years ago

    Hey Buub,

    Sorry to burst your bubble, but I am running an Athlon64 3400+. I’d consider that pretty top of the line for an SFF. Trade offs? Yeah, but as a person who works for a manufacturer of PC parts, we look at market trends and guess what? The XPC or SFF market is growing leaps and bounds vs. the “shove everything into an Antec 1030SX” crowd.
    Burst my bubble… right. Thats why ATI’s solution (don’t know about their upcoming part) fits just fine and has historically (past 18 months) kicked the crap out of Nvidia without the use of two power connectors and covering up an adjacent PCI slot. My statement was that it was too bad they didn’t consider these “limitations”. They are alienating a growing market demographic.

    As far a Prescott is concerned? Don’t think I’d ever own one as I’m an AMD person myself. But I understand the example 🙂

    -E

      • Krogoth
      • 16 years ago

      The 6800U appeals to the hardcore gamer market, who happen to utilize tower and full-tower chassis. XPCs just don’t have the room or a power supply large enough to accommodate the 6800U. I doubt the X800 will be any better, since it’s likely to be as much of a power hog as the 6800U.

      • Krogoth
      • 16 years ago

      OOPSSS….. Sorry for the double posting my finger slipped 🙁

        • Krogoth
        • 16 years ago

        OOPSSS….. Sorry for the double posting 🙁

      • Buub
      • 16 years ago

      OK, so to look at it from your other argument, nVidia shouldn’t make any card that doesn’t fit in a SFF system, because they are becoming so popular? Say that out loud. Does it sound silly to you too?

        • Krogoth
        • 16 years ago

        Nvidia is going to have to make a cut-down version of the 6800 to fit in an XPC. It’s going to have some extra TV/HDTV in/out stuff. There are quite a lot of XPCs out there, which are primarily used as PVRs and multimedia centers, not killer gaming rigs. Besides, the Radeon 9800 PRO still has more than enough power to last until Q2’05.

      • Berylium
      • 16 years ago

      Did I miss something here? The card reviewed is a sample unit, there’s even a picture of a 6800 that takes up just the AGP slot. So if there is a (probably) quieter solution that takes up a PCI slot and a (probably) louder solution that takes up only the AGP slot doesn’t that solve the “I have a SFF computer you insensitive clod” problems?

      Given how massive processor heatsinks have become and how many more transistors and power sapping issues these new graphics cards have I think it’s a miracle there’s even going to be a cooling solution that can fit in only an AGP slot at all.

    • elmopuddy
    • 16 years ago

    6800 = The Return of the King? 🙂

    About bloody time!

    EP

    • DrDillyBar
    • 16 years ago

    /me let’s nVidia out of the Dog House on probation.

      • My Johnson
      • 16 years ago

      Woof! er, Ruff!

    • Ha2z
    • 16 years ago

    The GeForce 6800 Ultra was only about an average 20% faster than the Radeon 9800 XT in Farcry. Older games run fast enough with older cards anyway.

    Edit: Anandtech got the 6800 to run 60% faster than the 9800XT in Farcry at 1600×1200 4xAA/8xAF. So it can potentially be a lot faster depending on the settings/situation.

      • wagsbags
      • 16 years ago

      ok here’s where you need to pay attention to where the speed counts. You don’t need 200 frames per second at low resolution, you need <40 or whatever at high resolution. And this is where this card performs.

        • hmmm
        • 16 years ago

        You’re right that this is where speed matters. It doesn’t matter if it is ten bazillion percent faster if it doesn’t meet the minimum playability threshold.

        You’re also right that it will let you bump up the resolution in a lot of older games. It doesn’t, however, perform as well in Far Cry. By your standard 10×7 is playable on R350 and NV40. 12×10 certainly isn’t playable above that on R350. NV40 might just scrape by @ 12×10. But that would be cutting it pretty close.

    • AmishRakeFight
    • 16 years ago

    although none of my shuttle cubes will be able to handle the power consumption, looks like a kick butt card.

    • OsakaJ
    • 16 years ago

    Very nice review. Now, can I have the card you used, please?
    🙂
    The performance in FarCry was a little disappointing, though. I wonder if the limits in FarCry performance are caused by immature code, drivers or CPU power. Any guesses, anyone?

      • indeego
      • 16 years ago

      Lots of motion all the time on the screen all at once?

      What is strange with Far Cry is it seems to slow down on my system when I go from […]

        • NeXus 6
        • 16 years ago

        There’s more lighting effects indoors. When set to Very High, advanced pixel/vertex shaders are used heavily.

    • packfan_dave
    • 16 years ago

    I can’t be the only one who’s really just wondering what the GeForce 6500 (or whatever the midrange card is called) and the Radeon X600 (same deal) are like.

    • indeego
    • 16 years ago

    One of the best TR reviews I’ve seen in a while. Fantastic work.

    /me sends some cash to TR donation jar. woot

    • graywulf2002
    • 16 years ago

    One question I have never found an answer for….. why is the actual GPU attached to the bottom of the graphics card??
    I mean, an AGP card is usually mounted with the die facing downwards, and the heatsink is blowing hot air downwards (possibly on other PCI cards).
    What reason prohibits mounting the GPU die on the back of the card?
    The heatsink would not be in conflict with other cards, the hot air would be pushed towards the top of the case (=PSU with fan)…. no real disadvantage I can discern.
    Most cards have memory chips as well as SMD resistors and capacitors soldered to the rear – manufacturing reasons are not going to be the problem here.
    Perhaps someone can give me a reasonable answer – I have no idea why this route is not taken and would like to hear some comment.

      • just brew it!
      • 16 years ago

      It is done this way because ATX spec says that you aren’t supposed to have any components taller than 0.105 inches mounted on the back side of the card. (So technically the cards using the Zalman heatpipe cooler are out-of-spec as well.)

      I agree, it would be a good idea for someone to produce cards like this… it would probably result in better GPU cooling for many people, especially people who have a lot of PCI cards. Cards with the GPU on the back side will not be usable in all motherboards though. (E.g. my Tyan Tiger MPX has one of the CPU sockets very close to the back side of the AGP card… a card with the GPU on the back simply would not fit.)

      • Buub
      • 16 years ago

      Probably because there are a lot of cases and motherboards that card would not fit in/on if they put the GPU on the other side.

      However the question I have:
      When will motherboard manufacturers get a clue and move the AGP/PCI-E slot up one more position, leaving a blank so two-slot cards don’t actually use up an extra slot?

        • Lucky Jack Aubrey
        • 16 years ago

        I hear you, but if memory serves, the first PCI slot on the mobos I’ve seen specs for (admittedly not all that many) shares an IRQ with the AGP slot. I’ve been leaving that slot blank regardless, because I don’t want another device sharing an interrupt with the graphics card.

          • indeego
          • 16 years ago

          “because I don’t want another device sharing an interrupt with the graphics card.”

          On a properly built ACPI system this (sharing IRQs) should never be a problem. If you don’t have a proper ACPI system built then you are compromising in other areas or for Linux compatibility, where it wouldn’t matter much anyway.

        • Pete
        • 16 years ago

        BTX will arrive just in time for these monstrous GPUs, I guess. The cooler will now face up, and will be in line with more case airflow. Win win.

          • indeego
          • 16 years ago

          I can’t see how a case redesign will make up for wattage/heat production going up with each new hardware generation. I mean, perhaps it’ll help airflow and smooth things along a tad, but you’re going to reach a point of diminishing returns.

            • Pete
            • 16 years ago

            I suppose, but is there another alternative? All you can do is provide the best cooling possible, as future chips will always use more power and create more heat. Anyway, I’m not sure how, but it seems GPUs can handle higher heat than CPUs. Well, the 5950s and 9800XTs were hitting 80C, though that may not have any bearing on these 200+M transistor monsters.

    • ExpansionSSS
    • 16 years ago

    True, this card is fast, but down the line all it does is make the next-gen games run at 60fps instead of 40fps. That’s about it…

    The older games were already running at a solid 60+ all the time, so another 60+ frames does nothing for me.

    You can look at these benches and know exactly how the next gen will look (Doom 3 and HL2 will run like Wolfenstein and Serious Sam do today).

      • MadCatz
      • 16 years ago

      My thoughts exactly! The one game I really wanted to see run faster was Far Cry, but all the 6800Ultra does is add about 10-12 fps. Seeing the original performance difference between the 9800XT and 5950Ultra in this game, all I can hope for is that the X800 will crush the 6800Ultra and get 60+ fps all the time. Who really cares about running older games at 100+ fps…improve the newer generation ones!

        • MorgZ
        • 16 years ago

        Albeit a true and widely shared sentiment, I do agree with the consensus that HL2/Doom 3 could well run better on these next-gen cards than Far Cry does.

        Remember that Far Cry is drop-dead gorgeous even at less than full settings, and nVidia could still crank up the fps with newer drivers.

      • Krogoth
      • 16 years ago

      The 6800U, in case you haven’t noticed, is currently bottlenecked by the CPU in newer games. Wait until the Socket 939 FXs come out to see the true potential of the 6800U in Far Cry.

        • hmmm
        • 16 years ago

        Maybe FC is CPU-bound to around 60 FPS, but even the 6800U’s performance drops off significantly as one cranks up the resolution and adds AA and AF. So I don’t think we can say its performance is CPU limited.

    • Dposcorp
    • 16 years ago

    Great review Scott.
    I can not wait to see ATI’s answer.

    Regardless, it’s nice to see the next gen of cards finally coming out; it was a bit of a long wait.
    (Your original TR article on the 9700 Pro is dated September 16, 2002, over a year and a half ago.)

    http://www.techreport.com/reviews/2002q3/radeon-9700pro/index.x?pg=1

      • My Johnson
      • 16 years ago

      God, that Bitch is huge.

    • Divefire
    • 16 years ago

    Well, that’s a very comprehensive review, and an interesting read without getting bogged down in the PR. Nicely done.

    On to the card itself: I know it sounds a little crazy, but I was expecting more… With the rumours of 12000+ in 3DMark03, I was expecting some stellar benchmarks in something along the lines of Far Cry and the other DX9 benchmarks.

    I just don’t see how a doubling of the 3DMark03 score over the 9800 XT translates into a 20 fps increase in Far Cry and the real-time HDR demo. Still, perhaps I’m missing something.

    A couple of interesting points, though: the power consumption of the card doesn’t seem anywhere near bad enough to warrant the two Molex connectors, compared to the XT and FX.

    Oh, and MPEG-4 decoding on the fly? That’ll be the warez scene’s new friend. Oh, and TiVo of course…

    • meanfriend
    • 16 years ago

    Well, finally something to make me consider upgrading from my GF4 Ti4200, though I’ll also wait for the mid-range version and try to overclock it. (One could buy a GameCube, a Game Boy Advance, an Xbox, AND a PS2 for the $500 this video card alone costs.)

    People should also check out the review on AnandTech. More or less the same conclusions, but they use a different set of games (including NWN, WC3, Halo, and others).

    One thing I haven’t seen yet is some CPU scaling tests. This definitely looks like a card for the future, but I wonder if my P4 2.4 can keep up?

    I’m now eager to see how ATI will respond. The FX generation didn’t do Nvidia any favors, but they seem to be back on track now…

    • Randy2
    • 16 years ago

    Nice job comparing old to new technology. I guess I don’t really see the purpose in it. I think the effort of this review should have been put into ATI’s next offering versus Nvidia’s. It’s like, we all knew the outcome of this review before even reading it.

    Well, I guess the “where is Half Life 2” mystery has been solved. Nvidia knew they would be humiliated if that game hit the shelves before they could come up with another card that could actually play that game. Problem is, now everyone that bought Nvidia cards will need to reinvest again, along with a new power supply.

    Does this card come with a molex Y connector, LOL.

    • Pete
    • 16 years ago

    Great preview, Scott. How’s ATi treatin’ ya? 🙂

    “Then check out the dark emblem at the top of the tail fin. On the Radeon, it’s very blurry, while on the GeForce 6800 Ultra it’s quite defined. If you click back through and look at the pictures without AA, that marking looks more like it does in the Radeon 6X AA shot. Interesting. Not sure what to make of that.”

    Isn’t this just supersampling filtering the textures? All the textures in the 8x screen look more defined, in keeping with SSAA’s texture benefits. 8x on NV40 is MS+SS (the AA tester screens show 2xMSAA on top of 4xSSAA, AFAIK), as with NV3x, right?

    One more thing: I’ve read FS’s review and yours, and I’m in the process of reading a few more (tabbed browsing, baby), but I haven’t seen anyone test different texture stages in addition to comparing trilinear optimization states. Have we learned nothing from the past? 🙂 Remember that ATi’s quality AF, when forced via the control panel, is only trilinear on the first texture stage and bilinear for the rest. Is this the same case with nV, both with optimizations on and off?

      • 5150
      • 16 years ago

      Are the Olsen twins 18 yet?

        • Krogoth
        • 16 years ago

        I just hate the Olsen twins and their empire, who really gives a crap anyway? They’ll probably end-up being like Paris Hilton.

          • Chryx
          • 16 years ago

          You mean they’ll end up in amateur porn videos available for download from the intarweb?!?

          • RyanVM
          • 16 years ago

          One can only hope 😀

        • ExpansionSSS
        • 16 years ago

        go to the link

    • Kraft75
    • 16 years ago

    No doubt about it, this card is fast.

    My main rig has an Ati Radeon 9700 Pro. I have a Shuttle SN41G2 as a 2nd PC using the onboard GeForce4 MX. My girlfriend uses it once in a while to game with me. I just find the nVidia display control panel very clumsy, with that left tab thingy. I didn’t realize until I saw the screenshots from this review that the same control panel is used all the way up to the high-end 6800 Ultra.

    Is it just me, or is the nVidia interface very clumsy? Maybe I’m just too used to the Ati interface.

    *gettin’ old*

    *inflexibility slowly setting in*

    *stubbornness rising to level 1*

    😀

    • Ruiner
    • 16 years ago

    Regarding the 2 molexes from separate rails thing…

    In every PSU I’ve opened up, the rails (12V, 5V) from the Molexes all connect to the same point anyway. Are they implying that the ~18-gauge wire is the chokepoint?

      • Pete
      • 16 years ago

      Yes, I believe each power connector can’t handle more than a certain amount of wattage, so they use two to be safe.
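
      (A rough, illustrative calculation of the point Pete is making. The per-contact current budget and the ~100W card draw below are assumptions picked for the sake of the example, not figures from the review or from any connector spec.)

        # Hypothetical numbers; adjust to taste.
        assumed_amps_per_contact = 6.0   # assumed conservative budget per pin/wire

        # A 4-pin peripheral (Molex) connector carries one 12V and one 5V line.
        watts_per_connector = 12.0 * assumed_amps_per_contact + 5.0 * assumed_amps_per_contact
        print(round(watts_per_connector))   # ~102W under these assumptions

        # A card drawing on the order of 100W under load would leave a single
        # connector with little headroom, so splitting the load across two makes
        # sense even if both lines tie back to the same point inside the PSU,
        # as Ruiner notes.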

    • mstrmold
    • 16 years ago

    Yes, but once again Nvidia has ignored us SFF people 🙁 You see, two-slot versions aren’t going to fit, and if their power recommendation requires such a monster of a power supply, well, then we are SOL. The biggest power supply available to the Shuttle public (I have an SN45G and an SN85G4) is the 250W SilentX, nowhere near enough power for this beast.

    It’s too bad, as this is the first Nvidia product I’d consider buying since my GeForce 3. The image quality looks top notch, an area ATI has always dominated. Hopefully either Nvidia relaxes the power requirements on a single-slot version without hobbling performance, or Shuttle (and other SFF manufacturers) come up with a power supply that can handle this monster.

    -E

      • Buub
      • 16 years ago

      You don’t seriously believe that every card should run in a SFF system do you? When you chose such a system, you did so knowing there were compromises. Well, you’re not going to be able to run the highest-end graphics cards in there. Sorry. That’s just the way it is. Just like you won’t be running a 3GHz+ Prescott in there either.

      nVidia will be making a lower-end version of the same card that, I’m sure, will work just fine in your box. If you want the most powerful stuff, you’re going to need a bigger box.

      • liar
      • 16 years ago

      Maybe an Antec Aria…

      • boing
      • 16 years ago

      Tom’s had a much better review, actually showing that the card doesn’t draw any more power than a 9800 XT.

        • Dissonance
        • 16 years ago

        Actually, the 6800 Ultra draws less power at idle, and only more power under load. Check page 28 of our review.

        • emkubed
        • 16 years ago

        TR did power consumption tests also. I’ll take their tests over Tom’s any day.

    • pedro_roach
    • 16 years ago

    When are we going to see the card hit store shelves? It’s all well and good that it looks like Nvidia got it right this time, but when can we actually get them?

    • DaveJB
    • 16 years ago

    Looks good, but there could yet be a fly in the ointment. According to a piece at XBit, the R420/423’s core speeds are going to be a lot higher than the 6800U’s – 600MHz on the 16-pipe version, compared to the 6800U’s 400MHz. With the memory clocks being similar, the 6800U could be in for a lot of hurting, unless nVidia can ramp this thing up fast!

    http://www.xbitlabs.com/news/video/display/20040413124046.html

      • RyanVM
      • 16 years ago

      Unless they’re memory bandwidth limited. (Maybe the 800MHz GDDR3 rumors will prove to be true, though).
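
      (To put rough numbers on DaveJB’s and RyanVM’s points, here is a minimal sketch. The 600MHz figure is the rumored R420 clock from the XBit piece; the ~1100MHz effective GDDR3 on a 256-bit bus is an assumption used purely for illustration.)

        def peak_fill_mpixels(pipes, core_mhz):
            # Theoretical peak pixel fill rate in Mpixels/s: one pixel per pipe per clock.
            return pipes * core_mhz

        def peak_bandwidth_gb_s(mem_mhz_effective, bus_bits):
            # Theoretical peak memory bandwidth in GB/s.
            return mem_mhz_effective * 1e6 * (bus_bits / 8) / 1e9

        nv40 = peak_fill_mpixels(16, 400)        # 6,400 Mpixels/s
        r420 = peak_fill_mpixels(16, 600)        # 9,600 Mpixels/s at the rumored clock
        bw   = peak_bandwidth_gb_s(1100, 256)    # ~35.2 GB/s, assumed the same for both

        # Just writing a 32-bit color value per pixel would require:
        print(nv40 * 4 / 1000, "GB/s vs", r420 * 4 / 1000, "GB/s, against", bw, "GB/s")

      Under those assumptions, the higher-clocked part would already be pressing up against the memory bus on raw color writes alone, which is exactly the bandwidth limit RyanVM is pointing to.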

    • atidriverssuck
    • 16 years ago

    oh my. My oh my oh my. This chip sho’nuff got the goods. For the first time in a long time I can sense upgrade fever.
    Yes, Upgrade Fever ™.

    When the mainstream models come out based on this tech, (the cut-down ones), I think I will pounce.

    Pounce, I tellsya.
    w3rd.

    • DukenukemX
    • 16 years ago

    Anyone else notice that the Geforce 6800 can’t complete all the Shader Mark 2.0 tests? Just like all the Geforce FX’s.

    It’s also interesting that the “FX” name was dropped. Anyone who owns a GeForce FX should notice that even Nvidia doesn’t want its new product associated with the reputation of the “FX” name.

    Also, most of the improvements in the GeForce 6800 are already in the Radeon 9800 Pro/XT.

    Nvidia should advertise the GeForce 6800 as “the way FX was meant to be.” If they sell it for $299–$399, they’ll have improved in areas besides performance.

      • atidriverssuck
      • 16 years ago

      Shadermark is a niggling issue, yes.

        • Ryszard
        • 16 years ago

        It can do ShaderMark at full precision, just the app requests a D3D format NV40 doesn’t expose. If ShaderMark is changed to use one of NV40’s other 16-bit FP texture formats, it should run just fine.

        Rys

    • fyo
    • 16 years ago

    So how many of the 200+ million transistors are additional caches? That would help explain how it can be so much more efficient with only a tad more memory bandwidth.

    What we really need is a die photo 😉

    -fyo

    • Chryx
    • 16 years ago

    g{

      • fyo
      • 16 years ago

      If it can real-time MPEG4 encode, that would be extremely sweet!

      That would make the low-end version (providing they don’t cut this feature out) perfect for HTPC duties ;-).

      -fyo

    • RyanVM
    • 16 years ago

    Great review, Damage (one of the better ones I’ve seen so far today). What I liked best was the mouseover image quality comparisons. It really makes life easier to see the differences. Maybe you could consider doing that in more reviews (and more frequently within).

    Heck, you could even make little flash jobs that can cycle through various cards (maybe even with checkboxes at the top to select which cards/modes you want to see). Meh, it would be cool anyway 😉

    Either way, great job on the review! 🙂

    EDIT: Forgot to say, bring on the x800XT! And I found it funny that a 2.2GHz A64 was the bottleneck in a lot of the benchmarks. It scares me to think what a faster system could do with that card.

      • Kevin
      • 16 years ago

      OH NO! Not flash! 😉

        • RyanVM
        • 16 years ago

        Meh, Anand already uses it. I think the player is standard enough that there wouldn’t be any real problems with it. And you have to admit, the idea of being able to customize the comparison to what you want to see would be cool 😛

          • Kevin
          • 16 years ago

          Oh, it might be cool. But that doesn’t compensate for the overall suckiness that is flash. The less flash in this world, the better place it will be. 😉

            • RyanVM
            • 16 years ago

            I suppose you could always hold out hope that MNG takes off 😉

            • rika13
            • 16 years ago

            how about gif or motion jpeg or another animated pic format?

        • Disco
        • 16 years ago

        I really like that idea of choosing what you want to compare – especially with the mouseover effect – It makes it so much easier than flipping between pages.

        Keep up the great work – awesome review!

    • EbolaV
    • 16 years ago

    Well, it’s nice to see an Nvidia card that I would consider buying again. I like reading different review styles, and what got me to buy my Radeon 9800 Pro was the fact that its picture quality was almost always better than Nvidia’s when there were “anomalies.” Now it appears that this is mostly a thing of the past, and we have good old-fashioned competition again. Before this card, Nvidia was no competition for ATI in my opinion. Now I, too, am eager to see what ATI has to offer. However, its super card won’t be out until the end of May; the 12-pipeline card they release at the end of April will give a good indication of where they will stand.

    What was nice to read at HardOCP was that they used a 434W Enermax power supply with no issues. I am hoping that my 430 True Power will be enough when I decide to upgrade in a year 😉

      • Autonomous Gerbil
      • 16 years ago

      As for power supplies, I wonder if Nvidia isn’t just playing it safe with their recommendations. We know how much power supplies can vary even when they’re rated the same, so maybe they decided that if they gave a really high recommendation, there wouldn’t be any problems. I’m just guessing, but I bet we find a bunch of lower-rated PSU’s that work just fine.

    • palhen
    • 16 years ago

    Isn’t this one of the most impressive performance boosts for a new generation of cards in a long time? We’re talking improvements of up to 50% in some cases… I’m impressed, but I prefer consoles for gaming… :-/

    Edit: bad spelling…

    • just brew it!
    • 16 years ago

    Great review.

    The NV40 is looking pretty damn good. I’m really glad to see that nVidia isn’t going to “pull a 3dfx” on us this time around; we need the video card market to be a 2-horse (at least) race.

    And even though I don’t buy high-end video cards, I’m pretty excited about what this is going to mean for the prices (and performance) of mid-range cards, once ATi and nVidia are finished rearranging their product lines.

      • PerfectCr
      • 16 years ago

      Agreed, I’d love to get my hands on a mid-range NV40 derivative. I am a SFF junkie and couldn’t handle the full 2 Molex NV40 Ultra in my SN85G4. A nice $199 6800 card would be perfect.

      • Autonomous Gerbil
      • 16 years ago

      I think they did “pull a 3dfx” on us. Isn’t this the first GPU out of Nvidia that ex-3dfx engineers had a hand in?

        • just brew it!
        • 16 years ago

        You know what I meant. 😉

        And actually, wasn’t the GeForce FX series also partially based on 3dfx’s work?

          • Autonomous Gerbil
          • 16 years ago

          I thought I remembered hearing that NV40 would be the first chip to incorporate 3dfx technology, but I wouldn’t say I was positive of it.

          BTW, I did know what you were getting at. I’m just silly.

            • Pete
            • 16 years ago

            No, I’m pretty sure the GeForce “FX” *hint hint* line was the first to incorporate the work of 3d”fx” engineers. You can read Beyond3D’s 6800U preview to see how a lead 3dfx engy was responsible for a greater portion of NV40.

      • Rakhmaninov3
      • 16 years ago

      Me too. Of course I hardly have a reason to upgrade from my Rad8500 128MB, given that I hardly have time to play any games these days:-(

    • WaltC
    • 16 years ago

    *Self NUKED*

    • robg1701
    • 16 years ago

    OH MY GAWWWD

    finally 16×12 with all the trimmings, hell my monitor does 2048×1536 it could be time to push it 😀

    • Krogoth
    • 16 years ago

    It seems like déjà vu all over again! The NV40 has effectively crushed the high-end R3xx series in the same areas where the R3xx crushed NV25 some 18 months ago. It looks like we have a winner here, without Nvidia resorting to lame driver cheating. I hope ATI’s X800 series won’t be far behind. It’s a shame the NV40 is practically overkill for most gamers out there who still run older Athlon XP and P4C systems. Anyway, I can’t wait for the workstation version of NV40, and to see how it will fare against 3Dlabs’s Wildcat series.

    I still like my Radeon 9700 Pro, even in light of this review. It still holds its ground pretty well against the newer breed of cards. It’s about time for something that literally blows it out of the water in the ultra-high-resolution/AF/AA arena, once the faster Socket 939 FXs get here.

    • WaltC
    • 16 years ago

    Just started reading this, and I was eager to see someone finally use 3DMark’s synthetic tests to check the 16-pixel-pipeline claim. I was disappointed to see that you declined, for some reason, to run the megapixel fill rate tests to check this, and ran the megatexel tests instead.

    Your fill rate results are therefore consistent with an 8×2 pipeline organization running 8×1 for single texturing and 8×2 for multitexturing. Running the megapixel fill rate tests in addition to the megatexel tests would have pretty much proven whether the pipeline is 16×1 or 8×2.

    It strikes me, though, that if the organization were indeed 16×1, then the single-texture fill rate tests should closely equal the multitexture fill rate tests, since in the first case you’d be looking at 16×1 for single texturing and in the second case 8×2 for multitexturing. Right? Or what have I overlooked? (Still pretty early here.)

      • Dissonance
      • 16 years ago

      Look at the RightMark Fill rate test again, with one texture it’s hitting 4227 megatexels/sec–more than an 8×1 at 400MHz would be capable of.

      Also notice how the 9800 XT’s single- and multi-texturing fill rates in 3DMark03 are quite different, the latter almost doubling the former, just like with the 6800 Ultra.

        • WaltC
        • 16 years ago

        Agreed, but these kinds of test results rarely, if ever, exactly parallel the manufacturer’s specifications. The megapixel tests measure the number of pixels produced by the pixel pipelines, and in that test the single-texturing pixel fill rate of the nV40 should be about double the multitexturing pixel fill rate, since, like the R3x0, the first test should record values consistent with 16×1 but drop back to 8×2 for the second (just as the R3x0 does 8×1 and drops back to 4×2).

        If the RightMark test is accurate, then with one texture at 400MHz in a 16×1 organization we should see something closer to 6400 Mtexels, right? Yes, 4227 Mtexels is more than we’d expect from 8×1, but it’s also significantly less than we’d expect from 16×1 at 400MHz… ;)

        By using only megatexel tests, it isn’t as clear as it could be, which is why it helps to run both pixel and texel fill rate tests. Of course, I’m not trying to make any kind of statement either way; I just think the pixel pipeline issue needs more verification testing (in light of what nVidia represented for nV3x in that regard).

        • WaltC
        • 16 years ago

        Actually, in waking up a bit and looking at some other reports, too, I really am satisfied that the 16×1 claim is accurate.
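
        (For anyone following the back-and-forth above, a minimal sketch of the fill rate arithmetic in question. The 400MHz clock is the 6800 Ultra’s; the two pipeline organizations are simply the hypotheses being compared, not a statement about the hardware.)

          CORE_MHZ = 400

          def single_texture_mtexels(pixel_pipes):
              # One texture per pixel: throughput is limited by the number of pixel pipes.
              return pixel_pipes * CORE_MHZ

          def multi_texture_mtexels(pixel_pipes, tmus_per_pipe):
              # Multitexturing: throughput is limited by the total number of texture units.
              return pixel_pipes * tmus_per_pipe * CORE_MHZ

          # 16x1 organization (NVIDIA's claim for NV40)
          print(single_texture_mtexels(16), multi_texture_mtexels(16, 1))   # 6400 6400

          # 8x2 organization (the alternative WaltC is probing for)
          print(single_texture_mtexels(8), multi_texture_mtexels(8, 2))     # 3200 6400

        The measured 4227 Mtexels/s with a single texture sits above the 3200 Mtexels/s ceiling of the 8-pipe case, which is the basis for Dissonance’s reading, while falling short of the 6400 Mtexels/s theoretical peak of 16×1, which is WaltC’s caveat.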

    • Ricardo Dawkins
    • 16 years ago

    Me doesn’t…..

    but..still let the WAR begin !!!

      • emkubed
      • 16 years ago

      Oh absolutely. Now I’m wanting to see the ATI review more than ever.

    • Forge
    • 16 years ago

    It’s OK. I’ll take one. SP.

    • PerfectCr
    • 16 years ago

    I like it! FP!
