Now, AMD is stirring things up again by introducing a new model of Athlon 64, the 3400+. Running at 2.2GHz, this CPU is very similar to the Athlon 64 FX-51, except that the 3400+ slides into a 754-pin socket and talks to only one channel of DDR400 memory. So the 3400+ doesn’t break new ground in terms of clock frequencies, but its introduction does suggest AMD is comfortable in its ability to produce enough 2.2GHz Athlon 64 processors to bring this speed grade to its higher-volume desktop platform.
We’re interested to learn several things about the 3400+. Its performance rating, for instance, suggests it’s faster than a theoretical Pentium 4 3.4GHz CPU. Can its performance back up that (implicit) claim? Also, how much difference is there between one memory channel and two? We’ve tested the Athlon 64 3400+ against its companions and competitors in an attempt to answer these questions, so read on.
In this corner…
Those of you not familiar with the Athlon 64 will want to read our initial review of the processor before continuing here. To refresh your memory, though, the Athlon 64 3400+ has a number of notable assets that help separate it from its predecessor, the Athlon XP. Among its virtues: an on-chip memory controller to cut memory access latency, a hefty 1MB of level 2 cache, support for SSE2 instructions, a radical system infrastructure based on high-speed HyperTransport links, and AMD’s 64-bit instruction set extensions. These changes have made the Athlon 64 a very tough competitor for the Pentium 4, even though Microsoft hasn’t yet delivered a 64-bit version of Windows.
Of course, we’d be remiss not to present some pictures of the Athlon 64 3400+. This particular chip, unlike our previous Athlon 64 and Opteron review units, sits on packaging dyed green, making it look very similar to a Pentium 4.
So that’s our subject today. Not much to look at, but what did you expect?
Interestingly enough, our A64 3400+ review unit showed us a less-than-cosmetic difference between itself and our A64 3200+ processor: the ability to run at CAS 2 with our Corsair test memory. Normally, we’d try to chalk up this difference to the chipset or some other external factor, but in this case, the memory controller is on the chip. AMD has no doubt made some revisions to the A64 over time, and it seems very possible the memory controller has been tweaked a bit. Whatever the case, our test results for the Athlon 64 3400+ use CAS 2 memory timings, and the 3400+ was entirely stable at CAS 2. Our A64 3200+ chip, however, was not, and we had to test it at CAS 2.5.
The other guys
The 3400+ is the star of the show today, but there are a couple of other Athlon 64 processors worth mentioning. For one thing, our benchmark results reflect a change in the Athlon 64 FX-51 system config. Corsair and other top memory makers have now produced low-latency memory for the Athlon 64 FX, registered DDR400 memory capable of running at a CAS latency of 2, so we’ve retested the FX with some. As a result, most of our Athlon 64 FX-51 scores are a little better than they were last time around, when we were running at CAS 2.5. That’s a noteworthy development, because the Athlon 64 FX-51 was already the fastest processor around.
On the other end of the spectrum, AMD recently introduced, very quietly, a relatively inexpensive version of the Athlon 64, the 3000+. This CPU is very similar to other Athlon 64 chips, except that it has only 512K of L2 cache on board, not the 1MB you’ll find in most Athlon 64 chips, and it runs at 2.0GHz. Most importantly, this puppy lists at only $218, or under a third of what you’d pay for an Athlon 64 FX-51. That’s a heckuva price for a 2GHz “Hammer” processor, even with a smaller L2 cache.
Intrigued, we ordered up an Athlon 64 3000+ for testing from an online vendor, but the bums didn’t get it here in time for our article. We will have to update you on the A64 3000+’s performance numbers at a later date.
Now, on with the show…
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.
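For readers who want to repeat that procedure, here is a minimal Python sketch of averaging repeated runs. The function name and the sample scores are hypothetical and are not figures from our testing.

```python
from statistics import mean

def average_runs(scores):
    """Average the scores from repeated runs of one benchmark."""
    if len(scores) < 2:
        # Mirrors the "at least twice" rule described above.
        raise ValueError("need at least two runs to average")
    return mean(scores)

# Hypothetical frame rates from two runs of the same test.
runs = [251.5, 249.5]
print(average_runs(runs))
```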
Our test systems were configured like so:
|Processor||Athlon XP ‘Barton’ 3200+ 2.2GHz||Athlon XP ‘Barton’ 2500+ 1.83GHz / Athlon XP ‘Barton’ 2800+ 2.183GHz||AMD Athlon 64 3200+ 2.0GHz / AMD Athlon 64 3400+ 2.2GHz||AMD Opteron 146 2.0GHz / AMD Athlon 64 FX-51 2.2GHz||Pentium 4 2.4GHz ‘C’ / Pentium 4 2.8GHz / Pentium 4 3.2GHz / Pentium 4 3.2GHz Extreme Edition|
|Front-side bus||400MHz (200MHz DDR)||333MHz (166MHz DDR)||HT 16-bit/800MHz downstream / HT 16-bit/800MHz upstream||HT 16-bit/800MHz downstream / HT 16-bit/800MHz upstream||800MHz (200MHz quad-pumped)|
|Motherboard||Asus A7N8X Deluxe v2.0||Asus A7N8X Deluxe v2.0||MSI K8T Neo||MSI 9130||Abit IC7-G|
|North bridge||nForce2 SPP||nForce2 SPP||K8T800||K8T800||82875P MCH|
|South bridge||nForce2 MCP-T||nForce2 MCP-T||VT8237||VT8237||82801ER ICH5R|
|Chipset drivers||nForce Unified 2.45||nForce Unified 2.45||4-in-1 v.4.49||INF Update 5.0.1015|
|Memory size||1GB (2 DIMMs)||1GB (2 DIMMs)||768MB (3 DIMMs)||1GB (2 DIMMs)||1GB (2 DIMMs)|
|Memory type||Corsair TwinX XMS4000 DDR SDRAM at 400MHz||Corsair TwinX XMS4000 DDR SDRAM at 333MHz||Corsair XMS3200 DDR SDRAM at 400MHz||Corsair CMX512RE-3200LL PC3200 registered DDR SDRAM at 400MHz||Corsair TwinX XMS4000 DDR SDRAM at 400MHz|
|Hard drive||Seagate Barracuda V 120GB ATA/100||Seagate Barracuda V 120GB ATA/100||Seagate Barracuda V 120GB SATA 150||Seagate Barracuda V 120GB SATA 150||Seagate Barracuda V 120GB SATA 150|
|Audio||nForce2 MCP/ALC650||nForce2 MCP/ALC650||VT8237/ALC650||VT8237/ALC201A||ICH5/ALC650|
|Graphics||NVIDIA GeForce FX 5900 Ultra|
|OS||Microsoft Windows XP Professional|
|OS updates||Service Pack 1, DirectX 9.0b|
Sorry about the 768MB of RAM in the Athlon 64 3200+ and 3400+ system. I couldn’t get it to boot with either pair of 512MB DDR400 DIMMs I had on hand, and its motherboard had only three DIMM slots, so 768MB was as close as we could come. I don’t believe this difference in memory size should affect any of the benchmarks we used.
All tests on the Pentium 4 systems were run with Hyper-Threading enabled.
Thanks to Corsair for providing us with memory for our testing. If you’re looking to tweak out your system to the max and maybe overclock it a little, Corsair’s RAM is definitely worth considering.
The test systems’ Windows desktops were set at 1152×864 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.
We used the following versions of our test applications:
- Cachemem 2.65MMX
- SiSoft Sandra MAX3! (2003.7.9.73)
- Compiled binary of C Linpack port from Ace’s Hardware
- Discreet 3ds max 5.1 SP1
- NewTek Lightwave 7.5
- Cinebench 2003
- POV-Ray for Windows v3.5
- PICCOLOR v4.0 build 451
- SPECviewperf 7.1
- ScienceMark 2.0 beta (06SEP03-A build)
- Sphinx 3.3
- LAME 3.93.1 (build from mitiok.cjb.net)
- Xmpeg 5.0.1 with DivX Video 5.05
- Quake III Arena v1.31
All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
Memory performance
Our customary synthetic memory benchmarks will start us off, and we can see how the A64 3400+’s single channel of DDR400 memory compares to the dual-channel solutions so common nowadays.
The results show a clear difference between the 3400+ and the dual-channel solutions. These synthetic benchmark scores may not translate directly into real-world performance, but they may be a primary reason for the existence of the dual-channel Athlon 64 FX.
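To put the single-versus-dual-channel gap in perspective, the theoretical peak numbers are easy to work out. The sketch below assumes the standard 64-bit data path per DDR channel and 400 million transfers per second for DDR400. These are paper figures, not our measured results, and the function name is our own.

```python
def peak_bandwidth_gbs(mega_transfers, bus_width_bits=64, channels=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers * 1e6 * bytes_per_transfer * channels / 1e9

# One channel of DDR400, as on the Athlon 64 3400+.
single = peak_bandwidth_gbs(400)
# Two channels of DDR400, as on the Athlon 64 FX-51.
dual = peak_bandwidth_gbs(400, channels=2)

print(f"single-channel DDR400: {single:.1f} GB/s")
print(f"dual-channel DDR400:   {dual:.1f} GB/s")
```

On paper, then, the second channel doubles peak bandwidth from 3.2GB/s to 6.4GB/s. Real applications rarely see anything like that full difference.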
The only real difference between the Athlon 64 3400+ and the A64 FX-51 is the number of memory channels, as Linpack demonstrates. The two processors perform identically until matrix sizes become large enough for main memory to matter. Note, though, how the A64 3400+’s on-die memory controller allows it to achieve much higher throughput than the Athlon XP 3200+. In fact, the 3400+ just barely falls behind the Pentium 4 3.2GHz, and then only at the largest matrix sizes.
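As a rough guide to where "large enough for main memory to matter" begins, the sketch below estimates the largest square matrix of 8-byte doubles that fits entirely in a 1MB L2 cache. It is a simplification that counts only the matrix itself and ignores Linpack's other working data, so treat the result as a ballpark.

```python
import math

def largest_square_matrix(cache_bytes, element_bytes=8):
    """Largest N such that an N x N matrix of doubles fits in the cache."""
    return math.isqrt(cache_bytes // element_bytes)

l2_bytes = 1024 * 1024  # the 1MB L2 cache on the 3400+ and FX-51
n = largest_square_matrix(l2_bytes)
print(f"largest matrix that fits in L2: {n} x {n}")
```

By this estimate, matrices much beyond roughly 360 x 360 start spilling out of the L2 cache, which is where the number of memory channels begins to show.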
Memory latency is the 3400+’s real strength. The dual-channel Athlon 64 FX requires registered DIMMs, and those add a cycle of latency to memory accesses. As a result, the 3400+ beats everything in our memory latency test. Notice, especially, the massive latency difference between the Athlon XP 3200+ and the Athlon 64 3400+, which run at the same 2.2GHz clock speed. This is one of the main reasons why AMD is now able to run with the Pentium 4 so well.
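The cycle counts are easier to appreciate in nanoseconds. This sketch assumes the 200MHz memory clock of DDR400 and simply converts cycles to time. It covers only the CAS and registered-DIMM components, not total access latency, and the helper name is ours.

```python
def cycles_to_ns(cycles, clock_mhz=200.0):
    """Convert memory-clock cycles to nanoseconds."""
    return cycles * 1000.0 / clock_mhz

cas2 = cycles_to_ns(2)                 # CAS 2, as on our 3400+
cas25 = cycles_to_ns(2.5)              # CAS 2.5, as on our 3200+
registered_penalty = cycles_to_ns(1)   # extra cycle for registered DIMMs

print(f"CAS 2:   {cas2:.1f} ns")
print(f"CAS 2.5: {cas25:.1f} ns")
print(f"registered DIMM penalty: {registered_penalty:.1f} ns")
```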
Let’s dwell on this point with some 3D graphs…
Not only are our 3D graphs indulgent, but they’re useful, too. I’ve arranged them manually in rough order from worst to best, for what it’s worth. I’ve also colored the data series according to how they correspond to different parts of the memory subsystem. Yellow is L1 cache, light orange is L2 cache, and orange is main memory. The red series on the Extreme Edition graph represents L3 cache. Of course, caches sometimes overlap, so the colors are just an interesting visual guide.
The A64 3400+ produces some very impressive latency numbers across the board. Let’s see how those translate into real-world performance.
Unreal Tournament 2003
The A64 3400+ darn near matches its big brother, the FX-51, in Unreal Tournament. The Pentium 4 3.2GHz trails well behind, and only the face-saving P4 Extreme Edition can, well, save face for Intel.
Quake III Arena
The 3400+ cranks out some mind-bending, bone-shattering Quake III frame rates, second only to the Pentium 4 Extreme Edition. The lower access latencies on the 3400+ seem to make up for the lower total memory bandwidth. Quake III Arena seems to play especially well with the P4 EE’s massive on-chip caches, running nearly 90 fps faster on the P4 EE than on the standard Pentium 4 3.2GHz.
Wolfenstein: Enemy Territory
Again in Wolfenstein: ET, the Athlon 64 chips take the top spots, interrupted only by the P4 EE.
Serious Sam SE
Whatever the game, the 3400+’s performance looks very good, as do the rest of the Athlon 64 processors.
3DMark03
Surprisingly, 3DMark03 shows the Pentium 4 on top for once. The individual CPU tests in 3DMark, however, tell a different story…
In these processor-oriented sub-tests, the A64 again comes out on top.
Sphinx speech recognition
Ricky Houghton first brought us the Sphinx benchmark through his association with speech recognition efforts at Carnegie Mellon University. Sphinx is a high-quality speech recognition routine that needs the latest computer hardware to run at speeds close to real-time processing. We use two different versions, built with two different compilers, in an attempt to ensure we’re getting the best possible performance.
There are two goals with Sphinx. The first is to run it faster than real time, so real-time speech recognition is possible. The second, more ambitious goal is to run it at about 0.8 times real time, where additional CPU overhead is available for other sorts of processing, enabling Sphinx-driven real-time applications.
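Those two goals boil down to a simple ratio of processing time to audio length. Here is a minimal sketch of that arithmetic. The function names and the sample timings are hypothetical, not values from our Sphinx runs.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time as a multiple of the audio's duration."""
    return processing_seconds / audio_seconds

def meets_goal(rtf, target):
    """True if recognition runs at or below the target multiple of real time."""
    return rtf <= target

# Hypothetical example: 36 seconds to process 60 seconds of speech.
rtf = real_time_factor(36.0, 60.0)
print(f"{rtf:.2f} times real time")
print("faster than real time:", meets_goal(rtf, 1.0))
print("meets the 0.8x goal:", meets_goal(rtf, 0.8))
```

Lower is better here: a factor under 1.0 means the CPU finishes before the speaker does, and a factor at or under 0.8 leaves headroom for other work.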
The 3400+’s single memory channel doesn’t prove to be much of a hindrance here, believe it or not. The P4s take the top spots in Sphinx, but the 3400+ just trails the Athlon 64 FX-51.
LAME MP3 encoding
We used LAME 3.92 to encode a 101MB 16-bit, 44KHz audio file into a very high-quality MP3. The exact command-line options we used were:
lame --alt-preset extreme file.wav file.mp3
The Pentium 4 is hard to beat in media encoding, as it proves here.
DivX video encoding
Xmpeg is partially self-tuning, and it chose to use the SSE2 Optimized iDCT on the Hammer processors.
Even with SSE2 support, the 3400+ can’t keep up with the Pentium 4, falling one second behind the P4 2.8GHz.
3ds max rendering
We begin our 3D rendering tests with Discreet’s 3ds max, one of the best known 3D animation tools around. 3ds max is both multithreaded and optimized for SSE2. We rendered a couple of different scenes at 1024×465 resolution, including the Island scene shown below. Our testing techniques were very similar to those described in this article by Greg Hess. In all cases, the “Enable SSE” box was checked in the application’s render dialog.
The 3400+ splits the results with the Pentium 4 3.2GHz, outpacing it in one of the tests but not the other.
Lightwave rendering
NewTek’s Lightwave is another popular 3D animation package that includes support for multiple processors and is highly optimized for SSE2. Lightwave can render very complex scenes with realism, as you can see from the sample scene, “A5 Concept,” below.
Lightwave uses SSE2 well enough that more threads don’t really help, or so it seems. All the results below are single-threaded.
The 3400+ virtually ties the Athlon 64 FX-51 in Lightwave, so memory bandwidth probably isn’t a big bottleneck here. The Pentium 4 responds especially well to Lightwave’s SSE2 options, and the P4 3.2GHz renders all three scenes faster than the 3400+.
POV-Ray rendering
POV-Ray is the granddaddy of PC ray-tracing renderers, and it’s not multithreaded in the least. Don’t ask me why. Seems crazy to me. POV-Ray also relies more heavily on x87 FPU instructions to do its work, because it contains only minor SIMD optimizations.
When old-school x87 FPU math is the name of the game, the Athlon 64 excels. The 3400+ finishes rendering the scene 40 seconds ahead of the P4 3.2GHz.
Cinebench 2003 rendering and shading
Cinebench is based on Maxon’s Cinema 4D modeling, rendering, and animation app. This revision of Cinebench measures performance in a number of ways, including 3D rendering, software shading, and OpenGL shading with and without hardware acceleration.
Cinema 4D’s renderer is multithreaded, so it takes advantage of Hyper-Threading. For the AMD-based systems, I’ve reported the single-processor results. For the P4 systems, I’ve reported the multi-threaded results, which in all cases were notably faster.
The Pentium 4 is much faster in Cinebench’s rendering tests. In the shading tests, however, things are a bit more evenly matched.
SPECviewperf workstation graphics
SPECviewperf simulates the graphics loads generated by various professional design, modeling, and engineering applications.
Although the A64 3400+ isn’t really a workstation-class processor, it doesn’t get embarrassed in viewperf. Several of the tests, though, including drv-09 and proe-02, obviously prefer the dual-channel memory configurations to the 3400+.
ScienceMark
I’d like to thank Alex Goodrich for his help working through a few bugs in the 2.0 beta version of ScienceMark. Thanks to his diligent work, I was able to complete testing with this impressive new benchmark, which is optimized for SSE, SSE2, and 3DNow!, and is multithreaded as well.
In the interest of full disclosure, I should mention that Tim Wilkens, one of the originators of ScienceMark, now works at AMD. However, Tim has sought to keep ScienceMark independent by diversifying the development team and by publishing much of the source code for the benchmarks at the ScienceMark website. We are sufficiently satisfied with his efforts, and impressed with the enhancements to the 2.0 beta revision of the application, to continue using ScienceMark in our testing.
The molecular dynamics simulation models “the thermodynamic behaviour of materials using their forces, velocities, and positions”, according to the ScienceMark documentation. Sounds simple, right?
Primordia “calculates the Quantum Mechanical Hartree-Fock Orbitals for each electron in any element of the periodic table.” In our case, we used the default element, Argon.
The 3400+ rips through ScienceMark, taking second place to the Athlon 64 FX-51 in both the Primordia and Cipher AES tests. These last two tests, SGEMM and DGEMM, measure matrix math performance using several different codepaths optimized with several instruction set extensions, including SSE, SSE2, and 3DNow!
Interesting. I had expected the 3400+ to be a fair bit slower than the Athlon 64 FX-51, because matrix multiplication generally requires quite a bit of memory bandwidth. However, the 3400+ nearly matches the FX-51 step for step. The Pentium 4 with SSE2 turns in the highest peak scores, as one might expect after having seen our 3D rendering test results. However, the Athlon 64 cranks out excellent performance almost regardless of the codepath, which is a rare and useful virtue.
picCOLOR image analysis
We thank Dr. Reinert Muller with the FIBUS Institute for pointing us toward his picCOLOR benchmark. This image analysis and processing tool is partially multithreaded, and it shows us the results of a number of simple image manipulation calculations. The overall score is indexed to a Pentium III 1GHz system based on a VIA Apollo Pro 133. In other words, the reference system would score a 1.0 overall.
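To illustrate what "indexed to a reference system" means, here is a minimal sketch for a single timed test. It assumes a simple ratio of the reference machine's time to the measured time. How picCOLOR actually combines its sub-tests into an overall score is not something we can vouch for, and the timings below are hypothetical.

```python
def indexed_score(reference_seconds, measured_seconds):
    """Score relative to a reference system, which scores 1.0 by definition."""
    return reference_seconds / measured_seconds

# Hypothetical timings: the reference box takes 10 s, the test system 2.5 s.
print(indexed_score(10.0, 10.0))  # the reference system itself
print(indexed_score(10.0, 2.5))   # a system four times as fast
```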
As we’ve come to expect, the A64 3400+ nestles right in between the FX-51 and the 3200+.
Conclusions
Thanks to its on-chip memory controller and its ability to use unbuffered DIMMs, the 3400+ has the lowest memory access latencies we’ve ever seen. The benchmarks put the Athlon 64 3400+ just behind the Athlon 64 FX-51 in terms of overall performance, and I suppose AMD’s “3400+” model number is warranted, at least for gamers, for whatever that’s worth. As has often been the case, which chip is fastest depends quite a bit on what you want to do. The Athlon 64 3400+ pummels the Pentium 4 3.2GHz in most of our gaming benchmarks, although the P4 still does relatively well in our media encoding, speech recognition, and SSE2-laden 3D rendering tests. Athlon 64 processors are strong across the board, though, with few real performance weaknesses.
The most interesting questions about the A64 3400+, however, aren’t strictly about its performance. Many enthusiasts will have a hard time forking over the cash to build a system based on AMD’s 754-pin socket. Socket 754 only allows for a single-channel memory configuration, and AMD has already made clear its intention to move all Athlon 64 products to a new 939-pin socket later this year. If you’re hoping to upgrade your processor to a higher speed grade down the road, Socket 754 isn’t a very good bet. Then again, as fast as things move in motherboards, chipsets, and memory these days, many of us have just resigned ourselves to performing a motherboard upgrade along with each processor upgrade.
AMD has priced the 3400+ at $417, exactly at price parity, at least for now, with the Pentium 4 3.2GHz. The Athlon 64 FX-51 will remain at $733, making it an almost irrational purchase decision. The 3400+ is nearly as fast as the FX-51 in most applications, if not faster. The FX-51’s need for registered memory makes for two strikes against it: higher costs and higher latencies. Strike three, perhaps, is the need to purchase DIMMs in pairs because of the dual-channel config. Overall, the 3400+ is much more economical than the FX-51.
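For a sense of scale, here is a quick sketch of the price gap using the two list prices quoted above. It covers the processors only, not the registered memory the FX-51 also requires, and the function name is ours.

```python
def price_premium(expensive, cheaper):
    """How many times the cheaper part's price the expensive part costs."""
    return expensive / cheaper

fx51_price = 733    # Athlon 64 FX-51 list price, in dollars
a3400_price = 417   # Athlon 64 3400+ list price, in dollars

ratio = price_premium(fx51_price, a3400_price)
print(f"FX-51 costs {ratio:.2f}x the 3400+")
print(f"extra cost: ${fx51_price - a3400_price}")
```

That works out to roughly 1.76 times the price, or $316 more, for a chip that is only slightly faster in most of our tests.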
In fact, the 3400+ is the product that should finally push the Athlon 64 into the mainstream market. With its introduction, the Athlon 64 3200+ drops to $278, and the 3000+ slots in at a very affordable $218. Those prices mirror Intel’s prices for the Pentium 4 3.0GHz and 2.8GHz chips exactly. With these price cuts, the Athlon 64 has arrived in earnest. Now, if only we had a 64-bit version of Windows to run on it…