AMD’s Radeon HD 3850 and 3870 graphics cards

As you may know if you follow these things, AMD’s Radeon HD 2900 XT graphics processor, also known as the R600, wasn’t exactly a rousing success in all areas. The chip brought big gains in performance, features, and image quality, but it was late to market. When it arrived, it was fairly power-hungry and simply couldn’t match up to the performance of Nvidia’s top GeForce 8800 GPUs. Worse yet, the gaping hole between the $149 and $399 price points in the Radeon HD lineup left many enthusiasts wanting more.

Fortunately, graphics chips have relatively short lifespans, and the regular introductions of new chips bring ample opportunity for redemption. Today is one such opportunity, as AMD pulls the curtain back on its latest creation: a chip modeled on the R600 architecture that promises similar clock-for-clock performance, yet is under half the size and draws a fraction of the power. Better still, the new graphics cards based on this chip, the Radeon HD 3850 and 3870, look to be very affordable. That’s progress on all the right fronts, and it sets up PC gamers beautifully at a time when we’re seeing more good games released at once than, well, maybe ever.

Here’s the question: is this progress sufficient to allow AMD to catch up with Nvidia’s just-introduced and dauntingly formidable GeForce 8800 GT? Have the AMD guys engineered an even better stocking stuffer? Perhaps. Keep reading for some answers.


The RV670 GPU

If you’ll indulge me, I’d like to start out with the chip geekery, and then we can move on to the graphics cards themselves.

The subject of our attention today is the RV670 graphics processor, which is basically a revised version of the R600 that’s been converted to a smaller chip fabrication process and tweaked in numerous ways. The R600 itself was manufactured on an 80nm fab process, and it packed roughly 700 million transistors into a die area of 408 mm². That’s one big chip, and it was part of a growing trend in GPUs—or shall I say, a trend of growing GPUs. The things were adding transistors faster than Moore’s Law allows, so physical chip sizes were rising like crude oil prices.

This latest generation of GPUs is throwing that trend into reverse. Nvidia’s G92, which powers the GeForce 8800 GT, shoehorns an estimated 754 million transistors into a 324 mm² die (by my shaky measurements) via a 65nm process. The RV670 goes even further; its 666 million transistors occupy only 192 square millimeters, thanks to a 55nm fabrication process. The move to smaller chip sizes means several things, including cheaper chips, lower power consumption, and less heat production. In the case of the RV670, AMD says the advantages of a 55nm process over a 65nm one are mainly in size and power. They can fit 30% more transistors into the same space with a 10% reduction in power use, but with no real increase in transistor switching speed over 65nm.
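For a quick sense of what these process shrinks buy, here’s the transistor density math on the numbers above. (Remember, the G92 figures involve an estimated transistor count and my own shaky die measurement.)

```python
# Transistor density from the die sizes and transistor counts quoted above.
chips = {
    "R600 (80nm)":  (700e6, 408.0),  # transistors, die area in mm^2
    "G92 (65nm)":   (754e6, 324.0),
    "RV670 (55nm)": (666e6, 192.0),
}

for name, (transistors, area_mm2) in chips.items():
    density = transistors / area_mm2 / 1e6  # millions of transistors per mm^2
    print(f"{name}: {density:.2f}M transistors/mm^2")
```

The RV670 packs roughly twice as many transistors into each square millimeter as the 80nm R600 did, which is how a chip with most of the same guts ends up at under half the size.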

Here’s a quick visual on the chips, to give you a sense of relative size.



The RV670



Nvidia’s G92

You can see quite readily that the RV670 is a fairly small GPU—more of a wide receiver than a fullback like the G92. Yet the RV670 packs the same mix of 3D graphics processing units as the R600, including 320 stream processors, 16 texture units, and 16 render back-ends.

As you may have noticed, though, the RV670’s transistor count is down from the R600, despite a handful of new features. The primary reason for the reduction in transistors is that AMD essentially halved the R600’s memory subsystem for the RV670. Externally, that means the RV670 has a 256-bit path to memory. Internally, the RV670 uses the same ring bus-style memory architecture as the R600, but the ring bus is down from 1024 to 512 bits. Thus, the RV670 has half as many wires running around the perimeter of the chip and fewer ring stops along the way. Also, since the I/O portions of a chip like this one don’t shrink linearly with fabrication process shrinks, removing half of them contributes greatly to the RV670’s more modest footprint.

Of course, the obvious drawback to this move is a reduction in bandwidth, but we noted long ago that the R600 underperformed for a chip with its prodigious memory bandwidth. AMD says it has tweaked the RV670 to make better use of the bandwidth it does have by resizing on-chip buffers and caches and making other such provisions to help hide memory access latencies. On top of that, higher speed memories like GDDR4 should help offset some of the reduction in bus width.
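The bandwidth tradeoff is easy to quantify. A quick sketch, using the 1125MHz GDDR4 clock of the RV670-based Radeon HD 3870 and the 825MHz memory clock implied by the 2900 XT’s 105.6 GB/s peak:

```python
def mem_bandwidth_gbps(bus_width_bits, mem_clock_mhz, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s: bus width in bytes times effective data rate."""
    return bus_width_bits / 8 * mem_clock_mhz * transfers_per_clock / 1000

# R600 (Radeon HD 2900 XT): 512-bit external bus, 825MHz GDDR3
print(mem_bandwidth_gbps(512, 825))    # 105.6 GB/s
# RV670 (Radeon HD 3870): 256-bit external bus, 1125MHz GDDR4
print(mem_bandwidth_gbps(256, 1125))   # 72.0 GB/s
```

So the faster GDDR4 claws back a fair chunk of what the narrower bus gives up: the 3870 retains about two-thirds of the 2900 XT’s peak bandwidth rather than half.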

The RV670 GPU — continued



A block diagram of the RV670 GPU. Source: AMD.

The RV670’s 3D graphics portions aren’t totally cloned from the R600. AMD has added some new capabilities and has even convinced Microsoft to update DirectX 10 in order to expose them to programmers. The upcoming DirectX 10.1 and Shader Model 4.1 add support for some RV670-specific features and for some capabilities of the R600 that weren’t available in DX10. These enhancements include, most notably, cube map arrays that allow multiple cube maps to be read and written in a single rendering pass, a capability AMD touts as essential for speeding up global illumination algorithms. DX10.1 also exposes much more direct control over GPU antialiasing capabilities to application developers, allowing them to create the sort of custom filters AMD provides with its Radeon HD drivers. Microsoft and AMD have worked to tighten up some requirements for mathematical precision in various stages of the rendering pipeline in DX10.1, as well, in addition to various other minor tweaks.

Because being DX10.1 compliant is an all-or-nothing affair, the RV670 is the world’s only DX10.1-capable GPU, at least for the time being. I wouldn’t get hung up on the differences between DX10 and 10.1 when selecting a video card, though. I’m pleased to see AMD moving the ball forward, especially on antialiasing, but these aren’t major changes to the spec. Besides, only now are we starting to see differences in game support between DX9’s Shader Models 2.0 and 3.0, and many of those differences are being skipped over in favor of moving wholesale to DX10. (I would be more optimistic about DX10.1 support becoming a boon for Radeon HD 3800-series owners at some point down the line were it not for the fact that nearly every major game I’ve installed in the past two months has come up with an Nvidia logo at startup. That says something about who’s investing in the sort of developer-relations programs that win support for unique GPU features.)

Speaking of arcane capabilities changes, here’s an interesting one for you: RV670 adds the ability to process double-precision floating-point datatypes. This sort of precision isn’t typically needed for real-time graphics, but it can be very useful for non-graphics “stream computing” applications. AMD says the RV670 handles double-precision math at between a quarter and a half the speed of single-precision math, which isn’t bad, considering the application. In fact, they’ve already announced the RV670-based FireStream 9170 card.

So the RV670 includes mojo for many markets. One of those markets is mobile computing, and this time around, AMD is pulling in a mobile-oriented feature to make its desktop chips more power-efficient. The marketing name for this particular mojo is “PowerPlay,” which wraps up a number of power-saving measures under one banner. PowerPlay is to GPUs what Intel’s SpeedStep is to CPUs. At the heart of the mechanism is a microcontroller that monitors the state of the GPU’s command buffer in order to determine GPU utilization. With this info, the controller can direct the chip to enter one of several power states. At low utilization, the GPU remains in a relatively low-power state, without all of the 3D bits up and running. At high utilization, obviously, the GPU fires on all cylinders. The RV670 also has an intermediate state that AMD calls “light gaming,” where some portions of the graphics compute engine are active while the rest are disabled in order to save power. PowerPlay can also scale core and memory clock speeds and voltages in response to load. These things are handled automatically by the chip, and niftily, AMD has included a GPU utilization readout in its driver control panel.
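In rough terms, the mechanism works something like the sketch below. To be clear, AMD hasn’t published its state machine or thresholds; the state names follow AMD’s descriptions, but the utilization cutoffs here are purely illustrative.

```python
# A hypothetical sketch of PowerPlay-style power state selection.
# The real microcontroller logic and thresholds are AMD's and aren't public.
def select_power_state(gpu_utilization):
    """Map command-buffer utilization (0.0-1.0) to a power state."""
    if gpu_utilization < 0.2:
        return "low"           # 3D bits mostly idle; clocks and voltages reduced
    elif gpu_utilization < 0.6:
        return "light gaming"  # some portions of the compute engine active
    else:
        return "full 3D"       # all units running at full clocks

print(select_power_state(0.05))  # low
print(select_power_state(0.40))  # light gaming
print(select_power_state(0.90))  # full 3D
```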



The Overdrive section of AMD’s control panel now includes a GPU utilization readout

We will, of course, test the RV670’s power consumption shortly.

The RV670 improves on the R600 in a couple of other areas. One of those is its support for HD video playback. AMD’s UVD video decoder logic is fully present in the RV670, unlike the R600, so the RV670 can do most of the heavy lifting required for playback of high-definition video encoded with H.264 and VC-1 codecs. We’ve tested UVD’s performance with Radeon HD 2400 and 2600 series cards and found that those cards couldn’t scale video to resolutions beyond native 1080p (1920×1080). AMD claims the RV670 has sufficient bandwidth and shader power to scale movies up to 2560×1600 resolution, if needed.

To help make that possible, the RV670 includes support for HDCP over dual-link DVI connections and, like the R600, has a built-in digital audio controller it can use to pass sound over an HDMI connection. AMD offers a DVI-to-HDMI converter, as well.

The last bit of newness in the RV670 is the addition of PCI Express 2.0 connectivity. PCIe 2.0 effectively doubles the throughput of PCIe connections, with very little drama. PCIe 2.0 devices like the RV670 remain backward-compatible with older motherboards.

We’ve heard very little in the way of hype for PCIe 2.0, but AMD expects the faster interconnect to become quite useful when it enables “CrossFire X” via new video drivers slated for this coming January. When combined with the new RD790 chipset, RV670-based video cards will be able to run in two, three, or four-way configurations, and the four-way config would involve four expansion slots fed by eight PCIe 2.0 lanes each. In order to make such madness feasible, the RV670’s CrossFire interconnect has been boosted to double the pixel rate per connector, so that only a single physical connector is needed for dual-card CrossFire configs. The second connector on each card could be used for some daisy-chaining action in three- and four-way setups.
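The lane math behind those configs is straightforward once you account for PCIe’s 8b/10b encoding, which spends 10 signaling bits per data byte on the wire:

```python
# Peak PCIe bandwidth per direction, per the PCIe 1.x/2.0 specs:
# 2.5 GT/s per lane for 1.x, 5.0 GT/s for 2.0, with 8b/10b encoding.
def pcie_gbps(gt_per_sec, lanes):
    return gt_per_sec * lanes / 10  # GB/s per direction after 8b/10b overhead

print(pcie_gbps(2.5, 16))  # PCIe 1.x x16: 4.0 GB/s
print(pcie_gbps(5.0, 16))  # PCIe 2.0 x16: 8.0 GB/s
print(pcie_gbps(5.0, 8))   # one x8 PCIe 2.0 slot in a four-way CrossFire X rig: 4.0 GB/s
```

In other words, each card in a four-way config on eight PCIe 2.0 lanes would still get as much bandwidth as a full x16 slot delivers today.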

AMD is betting in a big way on CrossFire, and plans to address some long-standing complaints with the tech in upcoming drivers. One planned change is the ability to support multiple displays “seamlessly,” a capability one might have expected from the beginning out of multi-GPU acceleration. AMD’s Overdrive utility now allows the overclocking of multiple GPUs, too. The biggest change, though, will be this one: rather than producing another high-end chip to replace the Radeon HD 2900 XT at the top of its lineup, AMD plans to introduce the Radeon HD 3870 X2 this winter, a dual-GPU-on-a-stick card reminiscent of the GeForce 7950 GX2. Given the challenges to multi-GPU performance scaling we’ve seen lately, even with only two GPUs, I’m not sure what to think of this new emphasis on CrossFire. The history of the 7950 GX2, after all, is not a happy one. Time will tell, I suppose.

The cards

Now that I’ve filled your head with ethereal bits of RV670 theory, here’s a look at the hardware.



The Radeon HD 3850

This thin little number is the Radeon HD 3850, the lower end of the two RV670-based graphics cards. This puppy will feature a 670MHz GPU core and 256MB of GDDR3 memory running at 830MHz (or 1.66GHz effective data rate). AMD says this board uses 95W of power and rates its cooler at 31 dBA.



The Radeon HD 3870 sports a dual-slot cooler

And this beefier specimen is the Radeon HD 3870. Unlike the GeForce 8800 GT, this beast packs a dual-slot cooler, which is both a curse and a blessing. Yes, it eats more slots, but it also exhausts hot air out the back of the case and should be able to provide more cooling with less noise than a single-slot design. Aided by this cooler, the 3870 reaches core speeds of 775MHz, and it runs its 512MB of GDDR4 memory at 1125MHz. The big cooler belies relatively modest numbers, though: a rated board power of 105W and rated noise of 34 dBA.



Both 3800-series cards have dual CrossFire connectors



Even the 3870 needs only a single six-pin PCIe power connection

Both cards have twin dual-link DVI connectors and HDTV-out ports. They’re pretty much, uh, modern video cards as expected.

One nice touch about AMD’s new naming scheme: no suffixes like “XT” and “Pro” anymore. In the 3800 series, only numbers denote higher or lower performance, so things are much easier to decode. The folks in AMD graphics seem to have picked up this idea from the AMD CPU people, interestingly enough. Imagine that.

Doing the math—and the accounting

Those of you who are familiar with this GPU architecture may be jumping ahead. With a 775MHz core clock and only a 256-bit memory interface, how will the Radeon HD 3870 match up to the GeForce 8800 GT? Let’s have a look at some of the key numbers side by side, to give you a hint. Then we’ll drop the bomb.

                     Peak pixel   Peak bilinear     Peak bilinear     Peak memory   Peak shader
                     fill rate    texel filtering   FP16 texel        bandwidth     arithmetic
                     (Gpixels/s)  rate (Gtexels/s)  filtering rate    (GB/s)        (GFLOPS)
                                                    (Gtexels/s)
GeForce 8800 GT         9.6          33.6              16.8              57.6          504
GeForce 8800 GTS       10.0          12.0              12.0              64.0          346
GeForce 8800 GTX       13.8          18.4              18.4              86.4          518
GeForce 8800 Ultra     14.7          19.6              19.6             103.7          576
Radeon HD 2900 XT      11.9          11.9              11.9             105.6          475
Radeon HD 3850         10.7          10.7              10.7              53.1          429
Radeon HD 3870         12.4          12.4              12.4              72.0          496

Here are some of the key metrics for various enthusiast-class cards. We already know that the 3800 series’ ostensible competition, the GeForce 8800 GT, pretty much comprehensively outperforms the Radeon HD 2900 XT. As you can see, the HD 3870 slightly surpasses the 2900 XT in terms of pixel and texture throughput and peak shader capacity, but doesn’t have the same mammoth memory bandwidth. And, notably, the HD 3870 trails the 8800 GT in terms of texturing capacity and shader arithmetic. (Be aware that simple comparisons between GPU architectures on shader throughput are tricky. Another way of counting would reduce the GeForce 8-series cards’ numbers here by a third, and justifiably so.)
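If you’d like to check the Radeon rows yourself, the peaks fall right out of the unit counts and clock speeds quoted earlier in this review. A sketch, with the caveat that the two-FLOPs-per-clock figure assumes one multiply-add per stream processor per cycle:

```python
# Theoretical peaks for the RV670-based cards, from unit counts and clocks
# quoted in this article: 320 stream processors, 16 render back-ends,
# 16 texture units, and a 256-bit memory bus.
def radeon_peaks(core_mhz, mem_mhz, sps=320, rops=16, tex_units=16, bus_bits=256):
    core_ghz = core_mhz / 1000
    return {
        "pixel_fill_gpix": rops * core_ghz,                    # 16 render back-ends
        "texel_rate_gtex": tex_units * core_ghz,               # 16 texture units
        "bandwidth_gbps":  bus_bits / 8 * mem_mhz * 2 / 1000,  # DDR-type memory
        "gflops":          sps * core_ghz * 2,                 # multiply-add = 2 FLOPs/clock
    }

hd3870 = radeon_peaks(775, 1125)
hd3850 = radeon_peaks(670, 830)
print(hd3870)  # ~12.4 Gpix/s, ~12.4 Gtex/s, 72.0 GB/s, 496 GFLOPS
print(hd3850)  # ~10.7 Gpix/s, ~10.7 Gtex/s, ~53.1 GB/s, ~429 GFLOPS
```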

AMD apparently looked at these numbers, thought long and hard, and came to some of the same conclusions we did: doesn’t look like the 3870’s gonna perform quite as well as the 8800 GT. So here are some additional numbers for you: the Radeon HD 3850 should show up at online retailers today for $179, as should the HD 3870 at $219. This presents an interesting situation. The first wave of 8800 GTs largely sold out at most places, and prices rose above Nvidia’s projected “$199 to $249” range as a result. If AMD can supply enough of these cards and keep prices down, they may offer a compelling alternative to the 8800 GT, even if they’re not quite as fast overall. That certainly seems to be the hope in the halls of the former ATI. Whether that will come to pass, well, I dunno. Let’s see how these things actually perform.

Test notes

You may notice that I didn’t engage in a lot of GPU geekery this time around. I decided instead to focus on testing these new video cards across a range of the amazing new games and game engines coming out this fall. These GPUs are basically “refresh” parts based on existing technology, and their most compelling attributes, in my view, are their incredibly strong price-performance ratios.

In order to give you a better sense of perspective on the price-performance front, I’ve included a couple of older video cards, in addition to a whole range of new cards. Roughly a year ago, the Radeon X1950 Pro faced off against the GeForce 7900 GS at $199. This year’s crop of similarly priced GPUs have some substantial advantages in terms of specifications and theoretical throughput, but as you’ll see, the gains they offer in real-world performance are even larger—and they do it while delivering image quality that’s sometimes quite noticeably superior to last year’s models, as well.

You’ll find results for both the X1950 Pro and the 7900 GS in several of our gaming tests and in our power and noise measurements. I’ve had to limit their participation to scripted benchmarks because these cards were generally too slow to handle the settings at which we tested manually with FRAPS. Also, please note that these older cards are using DirectX 9 in Crysis, since they can’t do DX10.

That leads me to another issue. As I said, the older cards couldn’t handle some of the settings we used because, well, they’re quite intensive, with very high resolutions, quality levels, or both. We tested at these settings because we wanted to push the cards to their limits in order to show meaningful performance differences between them. That’s hard to do without hitting a CPU or system-level bottleneck, especially with cards this fast running in multi-GPU configurations. We did test at multiple quality levels with a couple of games in order to give you a sense of performance scaling, which should help. But please don’t take away from this review that a card like the Radeon HD 3850 can’t run most of these games at more common settings. Quite the opposite is true, which is why this new breed of cards is a nice fit for the new wave of games coming out. Most folks won’t run at 2560×1600 resolution, of course. We intentionally pushed the boundaries in order to tease out performance differences.

Also, please note that many of the GeForce cards in the tables below are clocked at higher-than-stock speeds. Nvidia’s board vendors have made a practice of selling their products at multiple clock speeds, and some of our examples are these hot-clocked variants. For instance, the 8800 GTS cards are all clocked at 575MHz (or in the case of the one XFX 320MB card, 580MHz) core clocks and correspondingly higher shader clocks. Obviously, that’s going to change the performance picture. We think it makes sense to include these cards because they’re typically fairly plentiful and available for not much of a premium over stock-clocked versions. They’re what we might buy for ourselves.

The one exception to that rule, at least right now, may be the GeForce 8800 GT. The first wave of these cards looks to have sold out at many online vendors, and all variants are going for something of a premium right now—especially the higher-clocked ones. We have included one “overclocked” version of the 8800 GT (from MSI) in our tests in order to show you its performance. This card is very fast, but be aware that it is not currently a $199 or even a $249 option.

Finally, in the graphs, I’ve highlighted the results for the Radeon HD 3800 series cards in bright yellow so they’re easy to spot. I’ve also highlighted the GeForce 8800 GT in pale yellow, so the closest competition is easier to compare.



Gigabyte’s X38-DQ6 served as our new CrossFire test platform

Our testing methods

As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and the results were averaged.

Our test systems were configured like so:

Processor: Core 2 Extreme X6800 2.93GHz (both systems)
System bus: 1066MHz (266MHz quad-pumped)
Motherboard: XFX nForce 680i SLI (SLI system); Gigabyte GA-X38-DQ6 (CrossFire system)
BIOS revision: P31 (680i); F5h (X38)
North bridge: nForce 680i SLI SPP; X38 MCH
South bridge: nForce 680i SLI MCP; ICH9R
Chipset drivers: ForceWare 15.08 (680i); INF update 8.3.1.1009 with Matrix Storage Manager 7.6 (X38)
Memory size: 4GB (4 DIMMs) in both systems
Memory type: 2 x Corsair TWIN2X20488500C5D DDR2 SDRAM at 800MHz
CAS latency (CL): 4
RAS to CAS delay (tRCD): 4
RAS precharge (tRP): 4
Cycle time (tRAS): 18
Command rate: 2T
Audio: integrated nForce 680i SLI/ALC850 and integrated ICH9R/ALC889A, both with RealTek 6.0.1.5497 drivers
Hard drive: WD Caviar SE16 320GB SATA
OS: Windows Vista Ultimate x86 Edition
OS updates: KB36710, KB938194, KB938979, KB940105, DirectX August 2007 Update

Graphics cards tested on the nForce 680i SLI system, all with ForceWare 169.01 drivers:

- XFX GeForce 7900 GS 480M 256MB PCIe
- Dual XFX GeForce 7900 GS 256MB PCIe
- GeForce 8800 GT 512MB PCIe
- Dual GeForce 8800 GT 512MB PCIe
- MSI NX8800 GT TD512E 512MB PCIe
- XFX GeForce 8800 GTS XXX 320MB PCIe
- XFX GeForce 8800 GTS XXX 320MB PCIe + MSI NX8800GTS OC 320MB PCIe
- EVGA GeForce 8800 GTS SC 640MB PCIe
- Dual EVGA GeForce 8800 GTS SC 640MB PCIe
- MSI GeForce 8800 GTX 768MB PCIe
- Dual GeForce 8800 GTX 768MB PCIe

Graphics cards tested on the Gigabyte X38 system, all with 8.43 drivers:

- Radeon X1950 Pro 256MB PCIe
- Dual Radeon X1950 Pro 256MB PCIe
- Radeon HD 2900 XT 512MB PCIe
- Dual Radeon HD 2900 XT 512MB PCIe
- Radeon HD 3850 256MB PCIe
- Dual Radeon HD 3850 256MB PCIe
- Radeon HD 3870 512MB PCIe

Thanks to Corsair for providing us with memory for our testing. Their quality, service, and support are easily superior to no-name DIMMs.

Our test systems were powered by PC Power & Cooling Silencer 750W power supply units. The Silencer 750W was a runaway Editor’s Choice winner in our epic 11-way power supply roundup, so it seemed like a fitting choice for our test rigs. Thanks to OCZ for providing these units for our use in testing.

Unless otherwise specified, image quality settings for the graphics cards were left at the control panel defaults. Vertical refresh sync (vsync) was disabled for all tests.

We used the following versions of our test applications:

The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

Enemy Territory: Quake Wars

We’ll start with Quake Wars since this game’s simple “nettimedemo” allows us to record a gaming session and play it back with precise repeatability on a range of cards at a range of resolutions. Which is what we did. A lot.

We tested this game with 4X antialiasing and 16X anisotropic filtering enabled, along with “high” settings for all of the game’s quality options except “Shader level” which was set to “Ultra.” We left the diffuse, bump, and specular texture quality settings at their default levels, though, to be somewhat merciful to the 256MB and 320MB cards. Shadows, soft particles, and smooth foliage were enabled where possible, although the Radeon X1950 Pro wasn’t capable of handling soft particles.

Mommy.

Sooo… much… data.

Where to start? I suppose by pointing out that we didn’t test the Radeon HD 3870 in CrossFire because AMD only supplied us with a single card, despite our best efforts. We’ll try to get a second one soon.

Beyond that, the HD 3800-series cards come out looking reasonably good here. The HD 3850 utterly trounces the GeForce 8600 GTS, Nvidia’s lower-end offering with 256MB of memory. The “overclocked” version of the 8600 GTS we tested is selling for $169.99 at Newegg, but it’s clearly outclassed by the HD 3850. Last year’s models, the GeForce 7900 GS and Radeon X1950 Pro, are also quite a bit slower than the HD 3850. The 3850 doesn’t fare too well in CrossFire, though, where it struggles to keep pace with a single HD 3870 and just doesn’t appear to scale well.

As for the Radeon HD 3870, it shadows the GeForce 8800 GT from a distance of about five FPS at our two higher resolutions. That’s pretty close, and the HD 3870 is delivering eminently acceptable frame rates at 1600×1200 with 4X AA and 16X aniso—no mean feat. Although it has substantially less memory bandwidth than the Radeon HD 2900 XT, the 3870 performs almost exactly the same, even up to 2560×1600 resolution. Perhaps that 512-bit memory interface was overkill, ya think?

Crysis demo

Crytek has included a GPU benchmarking facility with the Crysis demo that consists of a fly-through of the island in which the opening level of the game is set, and we used it. For this test, we set all of the game’s quality options at “high” (not “very high”) and set the display resolution to—believe it or not—1280×800 with 4X antialiasing. Even that was a little bit rough on some of the cards, so we tried again with antialiasing disabled and the game’s post-processing effects set to “Medium.” At these lower settings, we expanded the field to include some older and lower-end graphics cards, to see how they compare.

The HD 3850 and the GeForce 8800 GTS 320MB—especially the GeForce—both seem to suffer here because of their smaller amounts of onboard memory. Beyond that, the results are mixed. The HD 3870 again performs almost exactly like the 2900 XT. With 4X antialiasing and high-quality post-processing enabled, the HD 3870 hits the same median low score as the GeForce 8800 GT, though with a slower average. Without AA and enhanced post-processing, though, the HD 3870 trails the GT significantly.

By the way, I’ve excluded multi-GPU configs from this test because the Crysis demo and these driver revisions don’t appear to get along. Nvidia has just released some SLI-capable drivers, and we’re expecting a patch for the full game to enable better multi-GPU support. We’ll have to follow up with results from the full game later.

Unreal Tournament 3 demo

We tested the UT3 demo by playing a deathmatch against some bots and recording frame rates during 60-second gameplay sessions using FRAPS. This method has the advantage of duplicating real gameplay, but it comes at the expense of precise repeatability. We believe five sample sessions are sufficient to get reasonably consistent and trustworthy results. In addition to average frame rates, we’ve included the low frame rates, because those tend to reflect the user experience in performance-critical situations. In order to diminish the effect of outliers, we’ve reported the median of the five low frame rates we encountered.
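For the curious, here’s how those five sessions get condensed into the reported numbers. The figures below are made up for illustration, but they show why we prefer the median of the lows: one fluky dip doesn’t dominate the result.

```python
# Condensing five FRAPS sessions: average the per-session averages,
# and take the median of the per-session lows to blunt outliers.
# These sample numbers are invented for illustration.
from statistics import mean, median

session_avgs = [61.2, 58.9, 63.5, 60.1, 59.8]
session_lows = [34, 12, 31, 29, 33]  # note the one outlier low of 12 FPS

print(round(mean(session_avgs), 1))  # 60.7
print(median(session_lows))          # 31 -- the 12 FPS fluke doesn't drag it down
```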

Because the Unreal engine doesn’t support multisampled antialiasing, we tested without AA. Instead, we just cranked up the resolution to 2560×1600 and turned up the demo’s quality sliders to the max. I also disabled the demo’s frame rate cap before testing.

Once more, the HD 3870 does its best 2900 XT impression, turning in very similar frame rates. The 3870 is a little slower than the 8800 GT here, but you’ll be hard-pressed to feel the difference subjectively.

The poor HD 3850 is a bit overmatched at this resolution, which is to be expected from a 256MB graphics card. Pardon my impulse to stress things. To keep things in perspective, the HD 3850 will run the UT3 demo as smooth as glass at 1920×1200 with these same quality settings, averaging 52 frames per second and hitting a low of 30 FPS.

Call of Duty 4

This game is about as sweet as they come, and we also tested it manually using FRAPS. We played through a portion of the “Blackout” mission at 1600×1200 with 4X antialiasing and 16X aniso.

Here the HD 3870 again performs pretty much identically to the Radeon HD 2900 XT. Are we detecting a pattern? (Lightbulb appears in thought balloon. Ding!) Unfortunately for AMD, that’s not enough performance to keep up with the 8800 GT.

Meanwhile, the HD 3850 continues to struggle with CrossFire scaling. The 2900 XT doesn’t scale well here, either, though.

TimeShift

This game may be a bizarrely derivative remix of Half-Life 2 and F.E.A.R., but it’s a guilty-pleasure delight for FPS enthusiasts that has a very “action-arcade” kind of feel to it. Like most of the other games, we played this one manually and recorded frame rates with FRAPS. We had all of the in-game quality settings maxed out here, save for “Projected Shadows,” since that feature only works on Nvidia cards.

Another game shows us a similar pattern. The HD 3870 is fast, but not as fast as the 8800 GT—yet it’s delivering what feels to me like playable performance at 1920×1200 with 16X aniso. The HD 3850 can’t quite handle this resolution as gracefully.

BioShock

We tested this game with FRAPS, just like we did the UT3 demo. BioShock’s default settings in DirectX 10 are already very high quality, so we didn’t tinker with them much. We just set the display res to 2560×1600 and went to town. In this case, I was trying to take down a Big Daddy, a generally unsuccessful effort.

The HD 3870 again slots into things about where you’d expect, and the cards with less than 512MB of RAM onboard again suffer here due to my penchant for testing at high resolutions. I had to exclude the Radeon HD 3850 CrossFire here because it was painfully slow—like three frames per second. I believe this was just a memory size issue. Dropping to a lower resolution did seem to help.

Team Fortress 2

For TF2, I cranked up all of the game’s quality options, set anisotropic filtering to 16X, and used 4X multisampled antialiasing at 2560×1600 resolution. I then hopped onto a server with 24 players duking it out on the “ctf_2fort” map. I recorded a demo of me playing as a soldier, somewhat unsuccessfully, and then used the Source engine’s timedemo function to play the demo back and report performance.

Unfortunately, I wasn’t able to complete my TF2 testing before Valve pushed out an update over Steam and rendered my recorded demo incompatible with the latest version of the game. I decided to go ahead and give you what results I have, even though the Radeon HD 3870 isn’t included. I really like Steam, but I sure don’t like its weak-to-useless user-side change control.

Moving on….

Power consumption

We measured total system power consumption at the wall socket using an Extech power analyzer model 380803. The monitor was plugged into a separate outlet, so its power draw was not part of our measurement. The cards were plugged into a motherboard on an open test bench.

The idle measurements were taken at the Windows Vista desktop with the Aero theme enabled. The cards were tested under load running BioShock in DirectX 10 at 2560×1600 resolution, using the same settings we did for performance testing.

Check that out! Although the Radeon HD 3870 performs almost exactly like the 2900 XT, it does so while drawing vastly less power. Our HD 3870-equipped test system draws 98W less than our otherwise-identical 2900 XT-based rig does while running BioShock. That’s a massive honking reduction in power consumption versus AMD’s previous-gen chip. Not only that, but the HD 3870 system pulls 21W less at idle and 39W less under load than the GeForce 8800 GT system.

Noise levels

We measured noise levels on our test systems, sitting on an open test bench, using an Extech model 407727 digital sound level meter. The meter was mounted on a tripod approximately 14″ from the test system at a height even with the top of the video card. We used the OSHA-standard weighting and speed for these measurements.

You can think of these noise level measurements much like our system power consumption tests, because the entire systems’ noise levels were measured, including the stock Intel cooler we used to cool the CPU. Of course, noise levels will vary greatly in the real world along with the acoustic properties of the PC enclosure used, whether the enclosure provides adequate cooling to avoid a card’s highest fan speeds, placement of the enclosure in the room, and a whole range of other variables. These results should give a reasonably good picture of comparative fan noise, though.

These measurements probably aren’t precise enough that it’s worth worrying over minor differences like tenths of a decibel. The things to take away from this are the big differences, because those are the ones you’ll notice. Such as: the Radeon HD 2900 XT is rather loud while running games, as is the GeForce 7900 GS. The Radeon HD 3850 and 3870, meanwhile, are both as quiet as they come.

Conclusions

AMD has made tremendous strides with this generation of GPUs. The Radeon HD 3870 delivers almost exactly the same performance as the Radeon HD 2900 XT, yet the chip is under half the size and brings an astounding near-100W reduction in power use while gaming. Since less power is expended as heat, the HD 3870 can be vastly quieter, as well. Honestly, I didn’t expect these sorts of efficiency gains from what is essentially still an R600-derived design—and it certainly appears now that AMD overshot in a major way when it gave the 2900 XT a 512-bit memory interface. That error has been corrected, and R600’s core architecture suddenly looks much nicer than it did before.

AMD needed every inch of that progress in order to come within shouting distance of the outstanding GeForce 8800 GT. Fortunately, they’ve done it, and the Radeon HD 3870 looks like a reasonable alternative, provided AMD can make good on its $219 price target. No, the HD 3870 isn’t as fast as the GeForce 8800 GT, but even at the very high resolutions we tested, it still achieved decent frame rates in today’s games—except for, you know, Crysis, which bogs down any GPU. If you’re using a display with a more common resolution like 1600×1200, 1680×1050, or 1920×1200, the HD 3870 will typically allow you to turn up the eye candy and still get fluid performance. Some folks will probably be willing to take that deal and pocket the difference in price versus the 8800 GT. I can’t say I’d blame them.

And, as complex as these GPUs are, the issues really do boil down to price, performance, and adequacy at the end of the day. The DirectX 10 spec has firmed up the requirements for image quality, and partially as a result, the latest Radeons and GeForces produce very similar, great-looking output. Although I think AMD has something of an edge in HD video playback filtering and noise reduction, even that is largely a wash. Most HD movies are going to look gorgeous on either card, regardless. I do plan to test HD video playback on these new cards soon, though, so we can get a closer look.

I’m not sure what to make of the Radeon HD 3850. The price is right, and it certainly delivers an awful lot of GPU power for the money. Yet as GPU throughput rises, the 256MB memory size found on many “enthusiast value” graphics cards increasingly feels like a mismatch for the available processing power. The HD 3850’s memory size may not prevent folks from having a good experience in most of today’s games, especially if they’re playing at display resolutions of 1600×1200 or less. But some games have higher memory requirements than others, and such requirements are always growing as newer games arrive. Features like antialiasing and anisotropic filtering require additional memory, as well. Owners of the HD 3850 may have to turn down some settings and compromise on image quality in order to play some games smoothly, even when the GPU has power to spare.

The precise impact of that compromise is hard to gauge, but I can say with confidence that the HD 3850 is a poor choice for use in CrossFire configurations. Adding a second card can nearly double your effective GPU power, but it does nothing for your effective memory size, which remains the same as with one card. In fact, the overhead for coordinating with another GPU in CrossFire probably consumes some video memory, which may be why CrossFire was actually slower on the HD 3850 in some cases, yet faster on the 2900 XT. That makes an HD 3850 CrossFire rig a pretty extreme mismatch between GPU power and video RAM.

I should mention that budget-minded folks who like the idea of a GPU-RAM size mismatch will have another option soon in the form of the GeForce 8800 GT 256MB for between $179 and $199. It’s no accident that range starts at the HD 3850’s list price, of course, and given what we’ve seen today, the 8800 GT 256MB ought to be faster than the HD 3850. I’d really rather have either company’s 512MB card, though, personally.

The next big question, I suppose, is how pricing and availability of 8800 GT and Radeon HD 3800-series graphics cards will play out over the next little while. I don’t think there are any bad choices here, especially among the 512MB cards. AMD will need to maintain its price advantage, though, in order to offer as compelling a product as the 8800 GT.
