AMD’s Radeon HD 2400 and 2600 graphics processors

AMD FINALLY PULLED BACK the curtain on the remainder of its DirectX 10 GPU lineup recently, and we, erm, kinda stumbled on getting the review out at that time. I offered various excuses, including word of a small fire in Damage Labs and limited time with the cards themselves, all of which were true—and dramatic, which really helps sell an excuse. But I was also held back from producing a review by the same borderline obsessive-compulsive impulse that drives us to produce detailed reviews with extensive test results and commentary. We had to get things tested to our satisfaction.

At last, I’m pleased to report, our review is complete. The result isn’t perfect by any means (as we are keenly aware), but we do have a number of intriguing things to offer, including a look at the new Radeon HD cards’ Avivo HD video acceleration capabilities, with tests of CPU utilization, image quality, and power use during playback. We also have a 3D graphics performance comparison, complete with some thoughts about why ATI’s new GPUs tend to fall short of expectations in that department. Keep reading for our take on the new low-end and mid-range Radeons.

RV630 and RV610 burst onto the scene
AMD’s family of DirectX 10-class graphics processors consists of a trio of GPUs: the R600, RV630, and RV610. We first reviewed the R600 when it launched back in May as the Radeon HD 2900 XT. We covered the basic R600-series technology then, and I’ll try to avoid repeating myself here. Go read that review if you want to know more about the core tech. The two new chips with which we’re concerned today are derivatives of R600, GPUs largely based on the same internal logic but scaled back with less internal parallelism to make smaller, cheaper chips.

The RV610 and RV630 share the R600’s unified shader architecture, which dynamically deploys on-chip computational resources to address the most pressing graphics problem at hand, whether it be for pixel shading or vertex processing and manipulation. AMD’s new five-wide execution unit is the basic building block of this shader engine. Each of the five arithmetic logic units (ALUs) in this superscalar unit can execute a separate instruction, leading AMD to count them as five “stream processors.” In theory, this architecture ought to make RV610 and RV630 more efficient than their immediate predecessors in the Radeon X1300 and X1650 series.

As DirectX 10-compliant GPUs, these R600 derivatives can perform a number of helpful new tricks, including streaming data out of the shader core—a modification of the traditional graphics pipeline needed to enable DX10’s geometry shader capability. And for you true propellerheads, these GPUs offer a more complete implementation of floating-point datatypes, with mathematical precision that largely meets the requirements of the IEEE 754 spec.

Beyond the 3D graphics stuff, RV610 and RV630 pack some HD video playback capabilities that they didn’t inherit from the R600. AMD has packaged up these video playback features under the marketing name “Avivo HD.” The most prominent of them is a facility AMD has dubbed UVD, for universal video decoder. UVD handles key portions of the decoding process for high-def codecs like H.264 and VC-1, lowering the CPU burden during playback of HD DVD and Blu-ray movies. (Despite AMD’s initial hints to the contrary, the Radeon HD 2900 XT lacks UVD acceleration logic.) The lower-end Radeon HDs also feature hardware acceleration of deinterlacing, vertical and horizontal scaling, and color correction. With considerably more power at its disposal, the big brother R600 handles these jobs in its shader core.

You may want to display your games or movies on a gloriously mammoth display, and the Radeon HD family has you covered there, as well. All R600-family GPUs have dual-link DVI display outputs with support for HDCP, that wonderful bit of copy protection your video card must support if you want to play HD movies on a digital display (without taking the additional 30 seconds required to install software to bypass it). AMD has embedded the HDCP crypto keys directly in these GPUs, simplifying card design by removing the need for a separate crypto ROM and—we hope—ensuring consistent support for HDCP across all Radeon HD cards.

If your display of choice has an HDMI connection, as many big-screen TVs do these days, the Radeon HD can talk to it via an HDMI adapter that plugs into a DVI port. Uniquely, the R600 and its derivatives have an audio controller built in, which they can use to pass 5.1-channel surround sound through an HDMI connection, delivering your digital audiovisual content in a nice, big copy-protected package, just as Hollywood has demanded. This may not be the most exciting of features, but it’s more or less necessary for playing back HD movies with digital fidelity.

Le chips
The RV630 GPU will power cards in the Radeon HD 2600 line. AMD has scaled down this mid-range chip in a number of dimensions, as the handy block diagram below helps illustrate.


Logical block diagram of the RV630 GPU. Source: AMD.

The original R600 has four SIMD units, each of which has 16 five-wide execution units in it, for a total of 320 stream processors. As you can see in the middle of the diagram above, the RV630 has three SIMD units, and each of those has only eight execution units onboard. That adds up to 120 stream processors, which is still quite a few and vastly more than the 32 SPs in the competing GeForce 8600. (For what it’s worth, the smaller number of units per SIMD should improve the RV630’s efficiency when executing shaders with dynamic branches, since the chip’s basic branch granularity is determined by the width of the SIMD engine.)
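The stream-processor arithmetic above can be checked with a short sketch. The function and dictionary names are purely illustrative; the SIMD and execution-unit counts come from the text, and the factor of five reflects AMD’s habit of counting each ALU in its five-wide execution unit as one stream processor.

```python
# Stream processors = SIMD units x execution units per SIMD x ALUs per unit.
ALUS_PER_EXECUTION_UNIT = 5  # each ALU in the five-wide unit counts as one SP


def stream_processors(simd_units: int, units_per_simd: int) -> int:
    """Return the stream-processor count AMD would quote for a given layout."""
    return simd_units * units_per_simd * ALUS_PER_EXECUTION_UNIT


# (SIMD units, execution units per SIMD), as quoted in the text
GPUS = {"R600": (4, 16), "RV630": (3, 8)}

SP_COUNTS = {name: stream_processors(*layout) for name, layout in GPUS.items()}

for name, count in SP_COUNTS.items():
    print(f"{name}: {count} stream processors")
```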

AMD has also scaled down the RV630’s texturing and pixel output capabilities by reducing the number of texture processing units to two and leaving only a single render back-end. As a result, the RV630 can filter eight texels and output four pixels per clock. That’s a little weak compared to the competition; the GeForce 8600 has essentially double the per-clock texturing and render back-end capacity of the RV630.

The RV630 retains the R600’s basic cache structure, with separate L1 caches for textures and vertices, plus an L2 texture cache, but it halves the size of the L2 texture cache to 128KB. At 128 bits, the RV630’s path to memory is only a quarter the width of the R600’s, but it’s comparable to competing GPUs in this class.

Thanks to this crash diet, the RV630 is made up of an estimated 390 million transistors, down precipitously from the 700 million transistors packed into the R600. That still makes the RV630 a heavyweight among mid-range GPUs. The G84 GPU in the GeForce 8600 series is an estimated 289 million transistors and is manufactured on TSMC’s 80nm process. We’ve measured it at roughly 169 mm².


The RV630 GPU.

Although TSMC manufactures the RV630 on a smaller 65nm fab process, we measured it at about 155 mm². (If you’d like to do a quick visual size comparison, we have a picture of the G84 in our GeForce 8600 review. All of our reference coins are approximately the same size as a U.S. quarter.)

The RV630’s partner in crime in the Radeon HD 2400 series is a featherweight, though.


Logical block diagram of the RV610 GPU. Source: AMD.

In order to bring it down to its diminutive size, AMD’s engineers chopped the RV610 to two shader SIMDs with just four execution units each, or 40 SPs in all. They left only one texture unit and one render back-end, so it can filter four texels and write out four pixels per clock. They also replaced the R600’s more complex vertex and texture cache hierarchy with a unified vertex/texture cache, and they reduced the memory path to 64 bits.

The result is a GPU whose 180 million transistors fit into a space only 7 mm by 10 mm, or 70 mm², when manufactured at 65nm. Nvidia’s G86 GPU on competing GeForce 8300, 8400, and 8500 cards is larger in every measure, with 210 million transistors packed into a 132 mm² area via an 80nm process. Here’s a quick visual comparison of the two below. Sorry about the goo on the Radeon chips; it’s really hard to clean that stuff off, even with engine cleaner.


The G86 GPU.


The RV610 GPU.

The RV610 is smaller than the active portion of Sean Penn’s brain, yet it has a full DirectX 10 feature set. Well, almost full: the Radeon HD 2400 series’ multisampled antialiasing tops out at four samples, though it can add more samples using custom tent filters that grab samples from neighboring pixels. Given the excellent image quality and minimal performance penalty we’ve seen from tent filters in the Radeon HD 2900 XT, that’s no great handicap.

The lineup
As you’ve probably heard, the Radeon HD 2900 XT didn’t deliver enough performance punch to knock the overall GPU performance crown off of Nvidia’s ever-expanding noggin. AMD didn’t even try to introduce an outright competitor to the GeForce 8800 GTX or Ultra, preferring to stick with the safe plan of offering a strong value at $399 to compete with the GeForce 8800 GTS. Since that development, many ATI/AMD fans have looked forward longingly to the launch of the Radeon HD 2600 series, expecting AMD to capture some glory in the form of the mid-range GPU crown. After all, AMD indicated it was aiming the Radeon HD 2600 XT at the $199 price point, where it would face the incumbent GeForce 8600 GTS. If the new Radeon could win that matchup, it would be a very compelling value in a graphics card for gamers.

However, as I’ve noted, AMD sent off warning signs as the Radeon HD 2400 and 2600 launch approached by trimming its projected prices. The Radeon HD 2600 line’s range dropped from $99-199 to $89-149, and the 2400 series went from “$99 and below” to “$85 and below.” That means, among other things, that AMD will have no answer to the GeForce 8600 GTS at around $199. Despite having a 100 million transistor advantage on the G84 GPU and comparable memory bandwidth, the RV630 evidently wasn’t up to the challenge.

AMD does seem committed to offering a compelling value where it can. I like this approach much better than the one ATI took with the Radeon X1600 XT, asking $249 for a graphics card that couldn’t match the competition’s $199 model. With the prices adjusted down, the initial low-to-mid-range Radeon HD lineup now looks like so:

| Card | GPU | Core clock (MHz) | Memory clock (MHz) | Memory interface | Price range |
|---|---|---|---|---|---|
| Radeon HD 2400 Pro | RV610 | 525 | 400-500 | 64 bits | $50-55 |
| Radeon HD 2400 XT | RV610 | 700 | 800 | 64 bits | $75-85 |
| Radeon HD 2600 Pro | RV630 | 600 | 500 | 128 bits | $89-99 |
| Radeon HD 2600 XT | RV630 | 800 | 800-1100 | 128 bits | $119-149 |

AMD and its partners will be offering two versions of the Radeon HD 2600 XT: one with GDDR3 memory clocked at 800MHz and another with GDDR4 memory clocked at 1100MHz. I’d expect the GDDR4 version to sell for closer to $149 and the GDDR3 version for closer to $119. (These are the sort of hard-hitting insights we deliver daily here at TR. Step back!)

We have several representatives from this lineup on hand.


The Radeon HD 2600 XT GDDR4

Here’s the Radeon HD 2600 XT, complete with a single-slot cooler and a set of connectors for internal CrossFire connections. Notice the absence of an auxiliary power plug. This puppy gets by on the 75W supplied by the PCIe slot alone. At nine inches, though, the 2600 XT is over an inch and a half longer than the GeForce 8600 GTS and over two inches longer than the 8600 GT.


The Radeon HD 2600 Pro

This 2600 Pro packs a small cooler and twin dual-link DVI ports, but there’s a notable omission: CrossFire connectors. Those wanting to build a multi-GPU config with the 2600 Pro will have to settle for passing data between the cards via PCI Express.


The Radeon HD 2400 XT

Oddly enough, the 2400 XT comes with a pair of CrossFire connectors, causing us some puzzlement. Why put ’em on this card and not on the 2600 Pro? Strange. The 2400 XT uses the same cooler as the 2600 Pro, but with its big cutout, the card itself is as tiny as the GPU onboard, relatively speaking.

We don’t have one, but I expect some 2400 Pro cards to be passively cooled, making them practically ideal for a home theater PC or similar device.

The competition
Figuring out the proper competitive matchups in the low end of the graphics card market is insanely tricky. Especially among Nvidia’s partners, card configurations and clock speeds tend to vary, prices can range widely for very similar products, and rebate deals can muddy the waters. That said, we can take a look at some street prices and get a sense of the market.

MSI’s GeForce 8500 GT card is currently selling for $74.99 at Newegg (plus a $10, ugh, mail-in rebate). Meanwhile, the XFX 8500 GT costs $79.99 at ZipZoomFly (plus a $20 mail-in rebate). Both cards run at Nvidia’s base clock speeds for the 8500 GT.

XFX offers multiple versions of the 8600 GT. The 540M variant, with a 540MHz core and 700MHz memory, is selling for $129 at TigerDirect and $134.99 at two other vendors. The 620M model has a 620MHz core and 800MHz memory. You can order it from Mwave for $136.97 and then send off for a $20 rebate. A host of other stores is selling this same card for $149.99 with the same rebate offer.

Finally, there’s the GeForce 8600 GTS. AMD has decided not to take this one on directly, but it’s still a notable presence in the market. XFX’s 730M variant has a 730MHz core and 1.13GHz memory, and it will set you back $224.99 at Newegg—not cheap. However, we can’t help but take notice of cards like this MSI 8600 GTS going for $164.99 at Newegg, plus a $10 rebate. The MSI’s 700/1050MHz clock speeds aren’t far off of the XFX card’s.


XFX’s GeForce 8500 GT and 8600 GT cards

So what do we make of all this? Here’s my best guess about how things will match up in the market once AMD’s new Radeons arrive in force. First, the GeForce 8600 GTS is positioned above anything in the Radeon HD 2600 series. That’s pretty clear. The closest competition for the Radeon HD 2600 XT GDDR4 is arguably cards like the XFX GeForce 8600 GT 620M, while the GDDR3 version of the 2600 XT will face off against the likes of the GeForce 8600 GT 540M.

From here, the waters get murkier. My sense is that the closest competition for the Radeon HD 2600 Pro will probably be the GeForce 8500 GT, although current prices put the 8500 GT closer to the Radeon HD 2400 XT’s projected list. I expect once things really shake out, the 2400 XT will end up doing battle against the GeForce 8400 GS for most of its lifetime.

With that in mind, we can set up the matchups you’ll see on the following pages. We’ve pitted the Radeon HD 2600 XT GDDR4 against XFX’s GeForce 8600 GT 620M in single-card performance. In SLI, we’ve added another XFX GeForce 8600 GT to the mix, but it’s the 540M model. Both cards in the pair will drop to the 540M’s clock speeds in order to work together. (Sorry, but we had to work with what we could get.)

The Radeon HD 2600 Pro will face off against the GeForce 8500 GT in single-card mode. We don’t have a second 2600 Pro, so we won’t have any CrossFire scores for it. Nonetheless, we’ve tested a pair of 8500 GT cards in SLI. Unfortunately, the MSI card in the pair lacks an SLI connector, so we’re doing SLI data transport via PCIe.

A couple of our contenders don’t have a direct competitor in the mix. The GeForce 8600 GTS showed up ready to fight, but AMD backed down. And we failed to snag a GeForce 8400 GS to test against the Radeon HD 2400 XT. Apologies for that. Just keep in mind that the presence of three cards from AMD and three from Nvidia doesn’t indicate three perfectly symmetrical price matchups, or reading our test results will be confusing.

Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and the results were averaged.

Our test systems were configured like so:

| Component | Test system 1 | Test system 2 |
|---|---|---|
| Processor | Core 2 Extreme X6800 2.93GHz | Core 2 Extreme X6800 2.93GHz |
| System bus | 1066MHz (266MHz quad-pumped) | 1066MHz (266MHz quad-pumped) |
| Motherboard | XFX nForce 680i SLI | Asus P5W DH Deluxe |
| BIOS revision | P26 | 1901 |
| North bridge | nForce 680i SLI SPP | 975X MCH |
| South bridge | nForce 680i SLI MCP | ICH7R |
| Chipset drivers | ForceWare 15.00 | INF update 8.1.1.1010, Matrix Storage Manager 6.21 |
| Memory size | 4GB (4 DIMMs) | 4GB (4 DIMMs) |
| Memory type | 2 x Corsair TWIN2X20488500C5D DDR2 SDRAM at 800MHz | 2 x Corsair TWIN2X20488500C5D DDR2 SDRAM at 800MHz |
| CAS latency (CL) | 4 | 4 |
| RAS to CAS delay (tRCD) | 4 | 4 |
| RAS precharge (tRP) | 4 | 4 |
| Cycle time (tRAS) | 18 | 18 |
| Command rate | 2T | 2T |
| Hard drive | Maxtor DiamondMax 10 250GB SATA 150 | Maxtor DiamondMax 10 250GB SATA 150 |
| Audio | Integrated nForce 680i SLI/ALC850 with Microsoft drivers | Integrated ICH7R/ALC882M with Microsoft drivers |
| Graphics | MSI GeForce 8500 GT 256MB PCIe with ForceWare 158.45 drivers | Dual Radeon HD 2400 XT 256MB PCIe with 8.83.9.1-070613a-048912E drivers |
| | MSI GeForce 8500 GT 256MB PCIe + XFX GeForce 8500 GT 450M 256MB PCIe with ForceWare 158.45 drivers | Dual Radeon HD 2600 XT 256MB PCIe with 8.83.9.1-070613a-048912E drivers |
| | XFX GeForce 8600 GT 620M 256MB PCIe with ForceWare 158.45 drivers | |
| | XFX GeForce 8600 GT 620M 256MB PCIe + XFX GeForce 8600 GT 540M 256MB PCIe with ForceWare 158.45 drivers | |
| | XFX GeForce 8600 GTS 730M 256MB PCIe with ForceWare 158.45 drivers | |
| | Dual XFX GeForce 8600 GTS 730M 256MB PCIe with ForceWare 158.45 drivers | |
| | Radeon HD 2400 XT 256MB PCIe with 8.83.9.1-070613a-048912E drivers | |
| | Radeon HD 2600 Pro 256MB PCIe with 8.83.9.1-070613a-048912E drivers | |
| | Radeon HD 2600 XT 256MB PCIe with 8.83.9.1-070613a-048912E drivers | |
| OS | Windows Vista Ultimate x86 Edition | Windows Vista Ultimate x86 Edition |
| OS updates | | |

Thanks to Corsair for providing us with memory for our testing. Their quality, service, and support are easily superior to no-name DIMMs.

Our test systems were powered by OCZ GameXStream 700W power supply units. Thanks to OCZ for providing these units for our use in testing.

Unless otherwise specified, image quality settings for the graphics cards were left at the control panel defaults.

The test systems’ Windows desktops were set at 1600×1200 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.

We used the following versions of our test applications:

The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

Sizing up the GPUs
I suppose I’ve already given away the game on performance by talking about the reasons why AMD decided to aim for lower prices on the eve of the Radeon HD 2400 and 2600 launch, but I still think the topic deserves some closer examination. Why is the R600 family underperforming? The answers have to do with some of the guesses AMD and Nvidia made about GPU usage models when they first set out to design these GPUs several years ago. AMD guessed differently than Nvidia about what mix of resources would be best to have onboard, and those guesses are embodied in the RV630 and RV610, as well as in the original R600.

These differences between AMD and Nvidia boil down to a few key metrics, which we can summarize and then measure with some simple tests in 3DMark. We’ll start with a table that shows theoretical peak throughput numbers.

| Card | Peak pixel fill rate (Gpixels/s) | Peak texel filtering rate (Gtexels/s) | Peak memory bandwidth (GB/s) | Peak shader throughput (GFLOPS) |
|---|---|---|---|---|
| GeForce 8400 GS | 3.6 | 3.6 | 6.4 | 43.2 |
| GeForce 8500 GT | 3.6 | 3.6 | 12.8 | 43.2 |
| GeForce 8600 GT 540M | 4.3 | 8.6 | 22.4 | 114.2 |
| GeForce 8600 GT 620M | 5.0 | 9.9 | 25.6 | 130.1 |
| GeForce 8600 GTS | 5.4 | 10.8 | 32.0 | 139.2 |
| Radeon HD 2400 Pro | 2.1 | 2.1 | 6.4 | 42.0 |
| Radeon HD 2400 XT | 2.8 | 2.8 | 12.8 | 56.0 |
| Radeon HD 2600 Pro | 2.4 | 4.8 | 16.0 | 144.0 |
| Radeon HD 2600 XT GDDR3 | 3.2 | 6.4 | 25.6 | 192.0 |
| Radeon HD 2600 XT GDDR4 | 3.2 | 6.4 | 35.2 | 192.0 |

Let’s start with the right-most column, shader throughput. These numbers represent theoretical peaks for the programmable shader cores, ruling out fixed-function units like interpolators. Generally, what you’re looking at here is what happens if all of the GPU’s stream processors are occupied at once with the optimal instruction mix, usually lots of multiply-add instructions, because they yield two operations per clock cycle. The obvious outcome here is that Radeon HD 2600 cards have a tremendous amount of peak shader throughput, with the 2600 XT easily surpassing the 8600 GT and even the 8600 GTS.
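As a rough check on the Radeon shader figures, here is a minimal sketch that assumes every stream processor retires one multiply-add (two floating-point operations) per clock. The stream-processor counts and core clocks are the ones quoted earlier in this review; the function and variable names are just for illustration.

```python
FLOPS_PER_MADD = 2  # a multiply-add counts as two floating-point operations


def peak_gflops(stream_processors: int, core_clock_mhz: float) -> float:
    """Theoretical peak in GFLOPS, assuming one MADD per SP per clock."""
    return stream_processors * FLOPS_PER_MADD * core_clock_mhz / 1000.0


# (stream processors, core clock in MHz), taken from the lineup described above
RADEONS = {
    "Radeon HD 2400 Pro": (40, 525),
    "Radeon HD 2400 XT": (40, 700),
    "Radeon HD 2600 Pro": (120, 600),
    "Radeon HD 2600 XT": (120, 800),
}

PEAKS = {name: peak_gflops(sp, mhz) for name, (sp, mhz) in RADEONS.items()}

for name, gflops in PEAKS.items():
    print(f"{name}: {gflops:.1f} GFLOPS")
```

These are best-case ceilings; real shaders rarely keep every ALU busy with multiply-adds.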

These numbers may even understate the case, because they’re assuming the GeForce 8 GPUs are able to co-issue a MADD and MUL in a single clock cycle, something that’s only possible in certain situations. If you discount this MUL, the GeForce chips’ peak throughput drops by a third—so the 8600 GTS peaks at 93 GFLOPS and the 8600 GT 620M peaks at 87 GFLOPS. Of course, there are counterpoints to be made by the Nvidia camp, not least of which involves the difficulty of consistently scheduling all five of the ALUs in the Radeons’ superscalar execution units with a full slate of work. The compiler in AMD’s drivers must sniff out dependencies ahead of time and schedule around them in order for the GPU to work properly. This issue will always be a challenge for the R600 and its relatives, but I am largely persuaded it won’t be a serious hindrance, in part because of the results of the tests we did here and in part due to the sheer amount of parallel processing power in these chips.
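To make the “drops by a third” arithmetic concrete, the sketch below assumes the quoted GeForce peaks count three operations per stream processor per clock (a two-operation MADD plus a co-issued MUL) and rescales them to MADD only. The input values are the quoted peak figures for the two cards; the helper name is hypothetical.

```python
OPS_WITH_MUL = 3     # MADD (2 ops) + co-issued MUL (1 op)
OPS_WITHOUT_MUL = 2  # MADD only


def discount_mul(peak_gflops: float) -> float:
    """Rescale a MADD+MUL peak figure to a MADD-only peak."""
    return peak_gflops * OPS_WITHOUT_MUL / OPS_WITH_MUL


QUOTED_PEAKS = {"GeForce 8600 GTS": 139.2, "GeForce 8600 GT 620M": 130.1}
MADD_ONLY = {name: discount_mul(value) for name, value in QUOTED_PEAKS.items()}

for name, gflops in MADD_ONLY.items():
    print(f"{name}: about {gflops:.0f} GFLOPS without the MUL")
```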

I’m also persuaded by our 3DMark shader test results, which tend to confirm the RV630’s shader prowess.

The Radeon HD 2600 XT beats out its ostensible direct competitor, the GeForce 8600 GT, in every test but the complex vertex shader one, and it’s close there. More notably, the 2600 XT outright creams even the 8600 GTS in the pixel shader and Perlin noise tests. (3DMark’s vertex shader tests sometimes seem not to max out shader throughput; the GeForce 8800 GTX has produced scores similar to the 8600 GTS in these tests, for whatever reason.) The long and the short of it is that the RV630 has quite a bit of shader power compared to the G84. The tiny RV610 also outdoes the G86 in the pixel shader, particles, and Perlin noise tests, but the gap is less pronounced there, as our theoretical throughput numbers suggested might be the case.

Look what happens when we consider theoretical peak pixel throughput and texturing, though. The Radeon HD 2600 XT tops out at 3.2 Gpixels/s of fill rate and 6.4 Gtexels/s of texture filtering capacity, while the GeForce 8600 GT 620M is substantially more capable, with peaks of 5 Gpixels/s and 9.9 Gtexels/s. The 2600 XT’s only strength here is memory bandwidth; it maxes out at over 35 GB/s, more than the 8600 GTS at 32 GB/s or the 8600 GT 620M at only 25.6 GB/s. Here’s what happens when we measure the more notable of these metrics, multitextured fill rate, in a simple synthetic test.
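Those peak figures follow from per-clock capacity and clock speed. The sketch below reproduces the Radeon HD 2600 XT GDDR4 numbers under two assumptions that are ours, not AMD’s documentation: the listed memory clock is doubled to account for double-data-rate signaling, and units are decimal (1 GB = 10^9 bytes). Function names are illustrative.

```python
def rate_giga_per_s(units_per_clock: int, core_clock_mhz: float) -> float:
    """Pixels or texels per second, in billions."""
    return units_per_clock * core_clock_mhz / 1000.0


def memory_bandwidth_gb_s(bus_width_bits: int, memory_clock_mhz: float) -> float:
    """Peak memory bandwidth in GB/s, assuming two transfers per clock (DDR)."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = memory_clock_mhz * 1e6 * 2
    return bytes_per_transfer * transfers_per_second / 1e9


# Radeon HD 2600 XT GDDR4: 4 pixels and 8 texels per clock, 800MHz core,
# 128-bit memory interface at 1100MHz
PIXEL_FILL = rate_giga_per_s(4, 800)
TEXEL_RATE = rate_giga_per_s(8, 800)
BANDWIDTH = memory_bandwidth_gb_s(128, 1100)

print(f"{PIXEL_FILL:.1f} Gpixels/s, {TEXEL_RATE:.1f} Gtexels/s, {BANDWIDTH:.1f} GB/s")
```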

The 2600 XT comes in just behind the 8600 GT and well back from the 8600 GTS. Importantly, the 2600 XT is achieving something close to its theoretical peak throughput, likely due to its superior memory bandwidth. The 8600 GT and GTS, meanwhile, are keeping some power in reserve; they don’t reach their peaks in this simple test. Both have additional filtering capacity they might use in the right situation, like with the higher quality filtering we like to use in games, where textures can be fetched and cached in blocks. We found that the R600 tended not to scale as well as the G80 with higher degrees of anisotropy.

Finally, we have the question of antialiasing performance, which would traditionally be connected with pixel fill rate and the capacity of a GPU’s render back-ends or ROPs. For instance, have a look at this diagram of one of R600’s render back-ends created by AMD.


Logical block diagram of an R600 render back-end. Source: AMD.

The logic that handles the resolve step for multisampled antialiasing is shown here where it traditionally resides in a modern GPU, but there’s a catch. That diagram is something of a fib, like AMD’s insinuations that the R600 had UVD. In truth, the resolve step is programmable because it’s not handled in custom logic at all; in the R600 family, MSAA resolve is handled in the shader core. AMD says it has included “a fast path between the render back-ends and the shader hardware” to allow the shaders to handle the resolve, and rightly argues that this provision can lead to higher image quality when combined with custom-programmed filters. Trouble is, this arrangement can also lead to lower performance. Dedicated logic tends to do jobs like traditional MSAA resolve quite well.

To give you some context, consider a claim AMD itself has made. The Radeon X1800 and X1900 series GPUs did filtering of 64-bit HDR-format textures in their shader cores, because their texture filtering units couldn’t handle those datatypes. When AMD introduced the R600, whose filtering units can process 64-bit textures, it claimed a 7X speedup in HDR texture filtering performance. Of course, you won’t “feel” this one aspect of overall performance as a 7X speedup in a game, but that was the claim.

For a better sense of the impact of the RV610/RV630’s lack of MSAA resolve hardware, have a look at this table, which shows 3DMark performance for our contenders with and without 4X multisampled AA.

| Card | 3DMark06 No AA | 3DMark06 4X AA | Performance penalty |
|---|---|---|---|
| GeForce 8500 GT | 2189 | 1637 | 25.2% |
| GeForce 8600 GT | 4938 | 3814 | 22.8% |
| GeForce 8600 GTS | 5740 | 4512 | 21.4% |
| Radeon HD 2400 XT | 2229 | 1512 | 32.2% |
| Radeon HD 2600 Pro | 3378 | 2279 | 32.5% |
| Radeon HD 2600 XT | 4888 | 3432 | 29.8% |

The Radeon HDs suffer roughly an additional 7% penalty over their GeForce counterparts in the move to 4X AA. Worse yet, the 2600 XT nearly ties the GeForce 8600 GT without AA, but it falls behind 3432 to 3814 with 4X AA enabled.
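The penalty figures are simply the fractional score drop from the no-AA run to the 4X AA run. A minimal sketch using the 3DMark06 scores reported above (the function name is ours):

```python
def aa_penalty_percent(score_no_aa: int, score_4x_aa: int) -> float:
    """Percentage of the no-AA score lost when 4X AA is enabled."""
    return (score_no_aa - score_4x_aa) / score_no_aa * 100.0


# (3DMark06 score without AA, score with 4X AA)
SCORES = {
    "GeForce 8500 GT": (2189, 1637),
    "GeForce 8600 GT": (4938, 3814),
    "GeForce 8600 GTS": (5740, 4512),
    "Radeon HD 2400 XT": (2229, 1512),
    "Radeon HD 2600 Pro": (3378, 2279),
    "Radeon HD 2600 XT": (4888, 3432),
}

PENALTIES = {
    name: round(aa_penalty_percent(no_aa, with_aa), 1)
    for name, (no_aa, with_aa) in SCORES.items()
}

for name, penalty in PENALTIES.items():
    print(f"{name}: {penalty}% slower with 4X AA")
```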

The big story here is a simple one. AMD has biased its GPUs’ on-chip resources, particularly in the R600 and RV630, toward delivering vast amounts of shader power at the expense of texturing capacity and pixel throughput—especially when multisampled AA comes into the picture. Nvidia’s GeForce 8 chips strike a different balance.

The question of memory bandwidth gets to be a little more complicated, because it raises the issue of intentions. Had AMD followed through on its plans to sell the 2600 XT at $199 and kept its initial price structure intact, AMD and Nvidia would have been matched up almost exactly at several price points and pretty close across the board. As things now stand, AMD offers quite a bit more memory bandwidth at each price point. Of course, that means they’re probably paying more to make the cards at each price point, as well.

Will AMD’s gamble on shader power yet pay off? Time will tell, but I doubt the GPU usage model will change sufficiently in the life of these products. That statement’s hardly a gamble given the life cycles of GPUs these days, but I’m getting way ahead of myself once again. We should probably look at some results from today’s games before speculating any further.

Battlefield 2142
We tested BF2142 by manually playing a specific level in the game while recording frame rates using the FRAPS utility. Each gameplay sequence lasted 60 seconds, and we recorded five separate sequences per graphics card. This method has the advantage of simulating real gameplay quite closely, but it comes at the expense of precise repeatability. We believe five sample sessions are sufficient to get reasonably consistent and trustworthy results. In addition to average frame rates, we’ve included the low frame rates, because those tend to reflect the user experience in performance-critical situations. In order to diminish the effect of outliers, we’ve reported the median of the five low frame rates we encountered.
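The reporting scheme described above can be sketched as follows. The frame-rate samples are invented placeholders, not measurements from this review, and the per-second sample layout and the averaging of per-session averages are both assumptions about how the logs would be processed.

```python
from statistics import mean, median


def summarize(sessions):
    """Return (mean of per-session averages, median of per-session lows)."""
    averages = [mean(samples) for samples in sessions]
    lows = [min(samples) for samples in sessions]
    return mean(averages), median(lows)


# Hypothetical fps samples for five sessions (shortened for brevity)
sessions = [
    [31, 28, 35, 22, 30],
    [29, 27, 33, 25, 31],
    [34, 30, 12, 29, 32],  # one session with an outlier dip
    [30, 26, 31, 24, 28],
    [32, 29, 36, 23, 33],
]

avg_fps, median_low = summarize(sessions)
print(f"average: {avg_fps:.1f} fps, median low: {median_low} fps")
```

Taking the median means the single dip to 12 fps in the third session does not drag down the reported low.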

For quality settings, we chose BF2142’s “high quality” defaults, with a bump up in resolution to 1280×960. That means we tested with high quality texture filtering and lighting plus 2X multisampled antialiasing.

Things start out a little rough here for AMD’s mid-range card. The 2600 XT runs well behind the GeForce 8600 GT, offering only borderline playability. The 2600 XT’s performance scales up well in a CrossFire multi-GPU config, but not well enough to catch the 8600 GT in SLI. AMD’s lower end cards fare better, with the GeForce 8500 GT bracketed by the 2600 Pro and 2400 XT.

Supreme Commander
Like many RTS and isometric-view RPGs, Supreme Commander isn’t exactly easy to test well, especially with a utility like FRAPS that logs frame rates as you play. Frame rates in this game seem to hit steady plateaus at different zoom levels, complicating the task of getting meaningful, repeatable, and comparable results. For this reason, we used the game’s built-in “/map perftest” option to test performance, which plays back a pre-recorded game.

Let’s get this out of the way up front: Supreme Commander isn’t a happy place for multi-GPU configs. SLI acts as a decelerator in this game, and CrossFire only delivers marginal performance benefits, at best. I’ve included the multi-GPU results for completeness, but we should focus on single-GPU performance. Once we do, we see that relative performance in SupCom looks a lot like what we saw in BF2142. The 8600 GT outperforms the 2600 XT, and the 8500 GT slots in between the 2600 Pro and 2400 XT.

The Elder Scrolls IV: Oblivion
For this test, we went with Oblivion’s default “high quality” settings and augmented them with 4X antialiasing and 16X anisotropic filtering, both forced on via the cards’ driver control panels. HDR lighting was enabled. Oblivion has higher quality settings than these, but the game looks pretty good with these options. We strolled around the outside of the Leyawiin city wall, as shown in the picture below, and recorded frame rates with FRAPS. This area has loads of vegetation, some reflective water, and some long view distances.

The 2600 XT looks relatively strong in Oblivion, nearly catching up to the GeForce 8600 GT. The combination of strong performance scaling in CrossFire and weak scaling with SLI allows the 2600 XT CrossFire config to trounce the 8600 GT SLI setup. On the other side of the tracks, the 2600 Pro is taking it to the GeForce 8500 GT. Meanwhile, the 2400 XT’s wimpy shader core probably holds it back in this game.

Rainbow Six: Vegas
This game is notable because it’s the first game we’ve tested based on Unreal Engine 3. As with Oblivion, we tested with FRAPS. This time, I played through a 90-second portion of the “Dante’s” map in the game’s Terrorist Hunt mode, with all of the game’s quality options cranked. That means HDR lighting and shader-based motion-blur effects were enabled. This game’s rendering engine isn’t compatible with traditional multisampled AA, so we had to do without.

R6: Vegas appears to be much friendlier ground for the 2600 XT; it vaults over the 8600 GT and even the 8600 GTS. Both the 2600 Pro and the 2400 XT outrun the 8500 GT, as well. Unfortunately, CrossFire performance puts a damper on things, slowing down the Radeons while the GeForces scale comparatively well in SLI.

3DMark06
I’ve already pretty much told you the story on 3DMark performance, but here’s a complete set of results at the benchmark’s default settings.

The 2600 XT is neck-and-neck with the 8600 GT across the entire range of resolutions. The 2600 Pro, meanwhile, is clearly more capable than the GeForce 8500 GT, which performs almost identically to the Radeon HD 2400 XT. As I’ve mentioned, the picture changes if we enable 4X antialiasing. Then, the 2600 XT drops behind the 8600 GT and the 2400 XT falls below the 8500 GT. The 2600 Pro, however, remains well ahead of the 8500 GT.

HD video playback – H.264
Next up, we have some high-definition video playback tests. We’ve measured both CPU utilization and system-wide power consumption during playback using a couple of HD DVD movies with different encoding types. The first of those is Babel, a title encoded at a relatively high ~25 Mbps with H.264/AVC. We tested playback during a 100-second portion of Chapter 3 of this disc and captured CPU utilization with Windows’ perfmon tool. System power consumption was logged using an Extech 380803 power meter.

We conducted these tests at 1920×1080 resolution on most of the cards, but we were surprised to discover something about our GeForce 8500 GT and 8600 GT cards from MSI and XFX: none of them support HDCP at all, even over a single DVI link, let alone dual. As a result, we had to test those cards at 1920×1440 resolution—still no scaling required—over an analog connection to a CRT monitor. The GeForce 8600 GTS and all of the Radeon HD cards worked perfectly with HDCP over a dual-link DVI connection to our Dell 30″ LCD.

Both the UVD logic in the Radeon HD 2400/2600 cards and the VP2 video processor in the GeForce 8500/8600 cards can accelerate H.264 decoding quite fully. We’ve also included a couple of high-end GPUs that lack UVD and VP2, to see how they compare.

All of the low-end and mid-range cards achieve substantially lower CPU utilization thanks to their H.264 decode capabilities. The high-end cards’ much higher scores drive that point home. The Radeon HD 2400 and 2600 cards do seem to consume a few more CPU cycles than their GeForce counterparts, though.

The story on power consumption is similar. The systems sporting GeForce 8500 and 8600 cards draw around 10 watts less than their Radeon HD competitors, but both GPU brands look to be very efficient overall. As a side note, the absence of UVD on the Radeon HD 2900 XT pretty much means what we expected: this GPU performs no better in HD video playback than its GeForce 8800 competition. In fact, the GeForce 8800 GTX consumes fewer CPU cycles while drawing the same amount of power.

HD video playback – VC-1
Unlike Babel, Peter Jackson’s version of King Kong is encoded in the VC-1 format that’s more prevalent among HD DVD movies right now. It’s also encoded at a more leisurely ~17 Mbps. The change in formats is notable because the bitstream processor in Nvidia’s VP2 unit can’t fully accelerate VC-1 decoding, while ATI’s UVD can. Nvidia downplays this difference by arguing that VC-1 is less difficult to decode anyhow, so the additional hardware assist isn’t necessary. Let’s see what kind of difference we’re talking about.

The Radeon HDs do indeed have an advantage over the GeForces in VC-1 playback, but it only amounts to about 5% less CPU utilization. Of course, that’s with a relatively fast 2.93GHz dual-core processor, and these cards will probably find their way into systems with slower CPUs, where the reduction in CPU load will be relatively larger. (Then again, with the way CPU prices have been going, I’m not so sure about that. If Intel follows through with its rumored quad-core price drop, the picture will change quite a bit.)
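To put a rough number on that slower-CPU argument, here is a back-of-the-envelope sketch. It reads the 5% figure as five percentage points of utilization and assumes the offloaded decode work is a fixed number of cycles per second, so its share of the CPU scales inversely with clock speed. The 2.0GHz target and the function name are hypothetical, and real processors won't scale this neatly.

```python
def scaled_cpu_savings(savings_points, reference_ghz, target_ghz):
    """Scale a CPU-utilization saving from one clock speed to another.

    Assumes the offloaded work is a fixed number of cycles per second, so
    its share of the CPU grows in proportion as the clock speed drops.
    """
    if reference_ghz <= 0 or target_ghz <= 0:
        raise ValueError("clock speeds must be positive")
    return savings_points * reference_ghz / target_ghz


# 5 points at 2.93GHz comes from the text; 2.0GHz is a hypothetical slower CPU.
print(round(scaled_cpu_savings(5.0, 2.93, 2.0), 1))
```

Under those assumptions, the same hardware assist is worth noticeably more on the slower chip, though still not a dramatic amount.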

The Radeons’ more frugal use of CPU cycles with this VC-1 disc doesn’t really translate into a power advantage. The 2600 XT still draws over 10W more than the 8600 GT.

HD HQV video image quality
We’ve seen how these cards compare in terms of CPU utilization and power consumption during HD video playback, but what about image quality? That’s where the HD HQV test comes in. This HD DVD disc presents a series of test scenes and asks the observer to score the device’s performance in dealing with specific types of potential artifacts or image quality degradation. The scoring system is somewhat subjective, but generally, the differences are fairly easy to spot. If a device fails a test, it usually does so in obvious fashion. I conducted these tests at 1920×1080 resolution. Here’s how the cards scored.

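| Test | Radeon HD 2400 XT | Radeon HD 2600 Pro | Radeon HD 2600 XT | GeForce 8500 GT | GeForce 8600 GT | GeForce 8600 GTS |
| --- | --- | --- | --- | --- | --- | --- |
| HD noise reduction | 0 | 25 | 25 | 0 | 0 | 0 |
| Video resolution loss | 20 | 20 | 20 | 20 | 20 | 20 |
| Jaggies | 0 | 20 | 20 | 0 | 10 | 10 |
| Film resolution loss | 25 | 25 | 25 | 0 | 0 | 0 |
| Film resolution loss – Stadium | 10 | 10 | 10 | 0 | 0 | 0 |
| Total score | 55 | 100 | 100 | 20 | 30 | 30 |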
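Each card's total is simply the sum of its five test scores, with 100 as the maximum. The short sketch below rechecks that arithmetic using the scores from the table above; the dictionary layout and function name are just for illustration.

```python
# Per-test HD HQV scores, in table order: noise reduction, video resolution
# loss, jaggies, film resolution loss, film resolution loss (stadium).
HQV_SCORES = {
    "Radeon HD 2400 XT": [0, 20, 0, 25, 10],
    "Radeon HD 2600 Pro": [25, 20, 20, 25, 10],
    "Radeon HD 2600 XT": [25, 20, 20, 25, 10],
    "GeForce 8500 GT": [0, 20, 0, 0, 0],
    "GeForce 8600 GT": [0, 20, 10, 0, 0],
    "GeForce 8600 GTS": [0, 20, 10, 0, 0],
}


def total_scores(scores):
    """Sum each card's per-test scores into its overall HD HQV total."""
    return {card: sum(tests) for card, tests in scores.items()}


for card, total in total_scores(HQV_SCORES).items():
    print(f"{card}: {total}")
```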

The Radeon HDs may have good reason for consuming a few more CPU cycles and a little more power than the GeForces in H.264 playback: they’re doing quite a bit more work in post-processing. Both of the RV630-based cards post perfect scores of 100, and their competition from Nvidia flunks out of the noise reduction and film resolution loss tests.

We could chalk up the GeForce cards’ poor scores here to immature drivers. Obviously, the current drivers aren’t doing the post-processing needed for noise reduction and the like. However, I received some pre-release ForceWare 162.19 drivers from Nvidia on the eve of this review’s release, which they claimed could produce a perfect score of 100 in HQV, and I dutifully tried them out.

Initially, I gave these new drivers a shot at 2560×1600, our display’s native resolution. With noise reduction and inverse telecine enabled, I found that our GeForce 8600 GT 620M stumbled badly in HD HQV, dropping way too many frames to maintain the illusion of fluid motion. After some futzing around, I discovered that the card performed better if I didn’t ask it to scale the video to 2560×1600. At 1920×1080, the 8600 GT was much better, but it still noticeably dropped frames during some HQV tests. Ignoring that problem, the 8600 GT managed to score 95 points in HD HQV. I deducted five points because its noise reduction seemed to reduce detail somewhat.

The faster GeForce 8600 GTS scored 95 points on HD HQV without dropping frames, even at 2560×1600. That’s good news, but it raises a disturbing question. I believe Nvidia is doing its post-processing in the GPU’s shader core, and it may just be that the 8600 GT is not powerful enough to handle proper HD video noise reduction. If so, Nvidia might not be able to fix this problem entirely with a driver update.

Also, even on the 8600 GTS, Nvidia’s noise reduction filter isn’t anywhere near ready for prime-time. This routine may produce a solid score in HQV, but it introduces visible color banding during HD movie playback. AMD’s algorithms quite clearly perform better.

Update 7/14/07: We originally said we tested HD HQV primarily at 2560×1600 resolution, but that’s inaccurate. We were unable to do so because of a bug in either ATI’s drivers or PowerDVD that prevented the Radeon HD cards from scaling video beyond 1920×1080. Due to this limitation, we tested all cards at 1920×1080. We’ve updated this page to reflect that fact. We have also inquired with ATI about the cause of the video upscaling problem and are awaiting an answer.

Power consumption
We measured total system power consumption at the wall socket using an Extech power analyzer model 380803. The monitor was plugged into a separate outlet, so its power draw was not part of our measurement.

The idle measurements were taken at the Windows desktop. The cards were tested under load running Oblivion at 1024×768 resolution with the game’s “high quality” settings, 4X AA, and 16X anisotropic filtering. We loaded up the game and ran it in the same area where we did our performance testing.

The Radeon HDs draw a little bit more power at idle than the GeForce cards, but they make up for it by pulling less juice when running Oblivion. Impressively, although the RV630 is a larger chip with more transistors, it draws less power than the G86.

Noise levels and cooling
We measured noise levels on our test systems, sitting on an open test bench, using an Extech model 407727 digital sound level meter. The meter was mounted on a tripod approximately 14″ from the test system at a height even with the top of the video card. We used the OSHA-standard weighting and speed for these measurements.

You can think of these noise level measurements much like our system power consumption tests, because the entire systems’ noise levels were measured, including the Zalman CNPS9500 LED we used to cool the CPU. Of course, noise levels will vary greatly in the real world along with the acoustic properties of the PC enclosure used, whether the enclosure provides adequate cooling to avoid a card’s highest fan speeds, placement of the enclosure in the room, and a whole range of other variables. These results should give a reasonably good picture of comparative fan noise, though.

Our noise testing reveals two clear results, which our ears picked up quite readily, as well. First, the Radeon HD 2600 Pro is the loudest card of the lot, both at idle and under load. That little cooler is a bit overmatched for the RV630 GPU. Second, the largest cooler of the lot, on the 2600 XT, is also the quietest overall. The 2600 XT’s cooler is much larger than the GeForce 8600 GT’s, and true to form, that translates into less noise.

Conclusions
The gaming performance numbers bear out the wisdom of AMD’s decision to introduce the Radeon HD 2600 XT at $149 and to adjust everything below it downward accordingly. The 2600 XT GDDR4 model we tested performs competitively against the XFX GeForce 8600 GT 620M, but even in that company, it tends to stumble with antialiasing enabled. As for the two lower end cards, much depends on how AMD and Nvidia choose to position their products in the crowded portion of the market below $100. Right now, Nvidia’s GeForce 8500 GT looks to be bracketed in both price and performance by the Radeon HD 2600 Pro and Radeon HD 2400 XT. I’d call that a qualified win for the 2600 Pro, since stepping up to its $89-99 price only makes sense to me. If you’re spending less than that on graphics, you’re getting the sort of performance you deserve, regardless of which brand of GPU you pick.

Beyond that, the new Radeon HD cards have some clear advantages in other departments, especially those features that fit under the Avivo HD umbrella. These GPUs’ native support for dual-link DVI ports with embedded HDCP crypto keys takes the guesswork out of connecting them to almost any sort of display you might choose. We found out about the perils of navigating these waters first-hand when we discovered none of our GeForce 8500 GT or 8600 GT cards support HDCP, and thus won’t play back HD movies over DVI. Radeon HD owners shouldn’t have to confront such surprises. The new Radeons’ support for HDMI with audio makes them nicely suited for home theater PCs, too.

Once you get those HD movies playing on your display of choice, the Radeon HD 2400 and 2600 offer the best overall combination of CPU offloading, power efficiency, and image quality available. The GeForce 8500/8600 chips’ inability to fully accelerate VC-1 decoding isn’t a big disadvantage in terms of additional CPU load or power consumption, but their poor scores in the HD HQV test raise concerns about image quality—as does, well, their image quality itself. In addition, the current state of post-processing in Nvidia’s pre-release drivers raises questions about whether cards like the GeForce 8600 GT will ever be up to the task of playing back HD movies with the sort of high-quality noise reduction the Radeon HD cards offer.

That may be little consolation for those who were hoping to see a killer DirectX 10-ready gaming card from AMD for around $200, a true replacement for the Radeon X1950 Pro. The X1950 Pro stands out as an excellent value still, but it’s growing increasingly difficult to recommend a DX9 card as a new purchase with the GeForce 8600 GTS in the mix. DX10 games are beginning to arrive, and eventually one or more of them will make that DX9 card feel old. Here’s the shame of it: the Radeon HD 2600 XT GPU packs about 100 million more transistors than the GeForce 8600 GTS, is built on a longer card with a larger cooler, and has more theoretical memory bandwidth and shader power. Yet it can’t keep pace with the 8600 GT all of the time, let alone the GTS, in current games. AMD’s aggressive pricing may make the 2600 XT a successful product and a reasonable choice for consumers, but it doesn’t entirely erase the sense of unrealized potential.
