We’ve had two versions of this CPU on the bench in Damage Labs for a while now: the screaming 4800+ that may be the fastest single microprocessor on the planet, and the gotta-have-it 4200+, AMD’s most affordable dual-core processor that promises to be every enthusiast’s new sweetheart. Can AMD’s dual-core desktop processors deliver on their promise? Keep reading.
Introducing the X2
As you might expect, the Athlon 64 X2 is simply the desktop version of the dual-core Opteron that we reviewed recently. It’s the same basic chip, just in a different package. In fact, the package will look very familiar to those of you who have seen a Socket 939 processor before. Here, in keeping with TR tradition, are a couple of big pictures of the CPU.
A couple of big pictures of the Athlon 64 X2 4200+
AMD says the Athlon 64 X2 should fit into existing motherboards with only a BIOS update, provided that the motherboard can deliver enough power to drive a current 90nm Athlon 64 processor. Existing Socket 939 CPU heat sink/fan combos should be sufficient to cool an X2, as well. As always, you’ll want to check with your motherboard maker before assuming your mobo will work with the new chip. There are always exceptions.
Given the performance of the dual-core Opterons, the X2’s common heritage should come as heartening news. The desktop version of AMD’s dual-core processor will come in four flavors featuring two different clock speeds, 2.2GHz and 2.4GHz, and two L2 cache sizes, 512K per core and 1MB per core. The entire lineup will look like so:
|CPU||Clock speed||L2 cache size (per core)||Price|
|Athlon 64 X2 4200+||2.2GHz||512KB||$537|
|Athlon 64 X2 4400+||2.2GHz||1024KB||$581|
|Athlon 64 X2 4600+||2.4GHz||512KB||$803|
|Athlon 64 X2 4800+||2.4GHz||1024KB||$1001|
The X2 line will range from expensive to painfully expensive to root-canal-without-anesthetic expensive. Unlike Intel, AMD will not initially be offering a relatively cheap dual-core processor that steps on the toes of its current single-core offerings in the meaty part of the market. All of the X2 chips are priced above the Athlon 64 4000+, and they get higher model numbers, as well.
Speaking of model numbers, AMD has apparently forgone a perfect opportunity to throw out its “clock speed equivalent” rating system, which grows less relevant over time, especially now that dual cores are the order of the day. Take the Athlon 64 X2 4800+, for example. The 4800+ is literally a pair of K8 cores running at the same clock speed as a single Athlon 64 4000+, but it only gets a model number increment of 800. Too modest? Perhaps, but who’s to say? Does a bear pope in the woods? A question can be hard to answer when it makes no sense. AMD would have done well to abandon its True Performance Initiative now that the shift to thread-level parallelism has thrown clock speed-based performance estimates out the window once and for all. Somehow, the company that once looked like it was tilting at windmills trying to get Intel to acknowledge that clock speeds aren’t a faithful indicator of CPU performance continues to beat on the windmill now that it lies broken on the ground.
Anyhow, the price-competitive scenarios between Intel and AMD CPUs look like this. The Athlon 64 X2 4800+ will match up directly against the Pentium Extreme Edition 840 at about a thousand bucks, while the X2 4200+ will stand toe to toe with the Pentium D 840 at around $530-ish. AMD’s X2 4400+ and 4600+ models will face little direct competition from Intel, and Intel’s low-end Pentium D 820 will face an asymmetric threat from AMD’s single-core chips like the Athlon 64 3400+. These will be strange times, indeed.
AMD will not be abandoning high-performance, single-core processors once the X2 arrives, either. The Athlon 64 FX will get at least one more refresh this summer, aimed at gamers who want the highest possible performance in their single-threaded entertainment. Longer term, multi-core processors are no doubt the future, but AMD is moving more cautiously and conservatively into dual-core desktop parts than Intel.
One of the more intriguing questions about AMD’s plans for the X2 has to do with its availability. AMD’s official word on the matter now is that the Athlon 64 X2 will be “available in June,” but when we visited AMD’s Austin, Texas offices in March to talk about its dual-core product plans, we got an unexpected lesson in the anatomy of a “rolling product launch.”
Here’s the plan as they communicated it to us, stage by stage. The Athlon 64 X2 would first be announced at the time that dual-core Opterons were unveiled (and it was). Some time after that, reviews would happen (that’s today). Next, there would be an official product launch. At that time, the first products would become available. (One could surmise that this day will come in June.) Initially, during the third quarter of the year, X2s would be sold primarily to OEMs and smaller system builders in Europe, as well as to system builders in the United States. After that, in the fourth quarter, AMD would turn its focus toward selling retail X2 processors in the U.S.
Obviously, initial availability will be sketchy, as it is now for Intel’s Pentium Extreme Edition 840. Both AMD and Intel are rushing to get their dual-core products into the hands of reviewers well ahead of the time when the processors will be available in volume. We’ve seen such launch tactics applied innumerable times before, but rarely have we seen it mapped out in such exquisite detail. What we still don’t know, however, is when A64 X2 processors will become available for eager PC enthusiasts to purchase via major online vendors like Newegg.com. Could be June; could be December. I suppose that depends less on what AMD has planned than on what AMD can manage to deliver. If so, only time will tell.
As I’ve said, the Athlon 64 X2 is the same basic chip as the dual-core Opteron, so it shares the same internal architecture. Here is a fancy looking but wildly simplified block diagram of that design.
The X2’s two CPU cores share a single, unified system request queue and a crossbar that connects them to the on-chip memory controller and HyperTransport link for I/O. This arrangement should allow the processor’s two cores optimal use of available resources without too much contention. The cores themselves are able to communicate with one another through the system request interface. Cache coherency updates and any data transfers between the two cores’ caches will happen over this high-speed, on-chip data path.
Despite what you see in the diagram above, the Athlon 64 X2 has only one HyperTransport link, because it will only be used in single-socket systems. The pricier Opterons get more than one link for use in multi-socket configs. That leaves the Athlon 64 X2 with 6.4GB/s of peak theoretical memory bandwidth and 8GB/s of peak theoretical I/O throughput. At 14.4GB/s total, that’s well more than the 6.4GB/s peak throughput of Intel’s 800MHz front-side bus.
Because Intel’s dual-core Smithfield chip has no internal data links between its two cores, all memory accesses, system I/O, and cache coherency updates must happen over its shared front-side bus. That leaves the Athlon 64 X2 with a sizeable bandwidth advantage, at least in theory.
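For the curious, those peak figures are nothing more than transfer rate times bus width. Here is a minimal sketch of the arithmetic. The bus widths and transfer rates in it are our assumptions, not numbers from AMD or Intel documentation quoted here: dual-channel DDR400 with 64-bit channels, a 16-bit HyperTransport link at 1GHz double-pumped and counted in both directions, and a 64-bit front-side bus at 800 million transfers per second. The function name `peak_gb_per_s` is just an illustrative helper.

```python
def peak_gb_per_s(transfers_per_s, bytes_per_transfer, links=1):
    """Peak theoretical bandwidth in GB/s, with 1 GB taken as 10**9 bytes."""
    return transfers_per_s * bytes_per_transfer * links / 1e9

# Assumption: dual-channel DDR400, two 64-bit (8-byte) channels at 400 MT/s.
memory = peak_gb_per_s(400e6, 8, links=2)

# Assumption: 16-bit (2-byte) HyperTransport link, 1GHz double-pumped
# (2000 MT/s), counting both directions of the link.
hypertransport = peak_gb_per_s(2000e6, 2, links=2)

# Assumption: 64-bit (8-byte) front-side bus at 800 MT/s.
fsb = peak_gb_per_s(800e6, 8)

print(f"X2 memory: {memory:.1f} GB/s")
print(f"X2 I/O:    {hypertransport:.1f} GB/s")
print(f"X2 total:  {memory + hypertransport:.1f} GB/s")
print(f"Intel FSB: {fsb:.1f} GB/s")
```

Keep in mind these are theoretical ceilings. Real-world throughput on either platform will land well below them.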
The two chips are very comparable in terms of size and transistor count, though. With 1MB of L2 cache per core, the X2, code-named “Toledo” on AMD’s roadmaps, packs roughly 230 million transistors into a die that’s 199 mm². Intel’s Smithfield is strikingly similar at about 233 million transistors and 206 mm².
The Athlon 64 X2’s two cores are both endowed with all of the updates that AMD included in its recent revision E of the K8 architecture. According to AMD, the changes in the E-step chips include the addition of SSE3 instructions, the ability to host mismatched DIMMs on a memory channel with little performance penalty, better memory loading so that a full house of DIMMs won’t be a drag on performance, and improved memory mapping.
I don’t think that’s the whole story, however.
In our testing, we’ve found that AMD’s 90nm chips have faster L2 caches, as demonstrated here. We’ve also found that the revision E cores perform quite a bit better clock for clock, especially in memory-intensive tasks. That leads me to believe that AMD has implemented some of the other features expected to come along with SSE3 support, perhaps including enhanced data prefetch, additional write-combining buffers, and the ability to convert the LEA instruction to an ADD in certain cases. I tried to shake some more details about these changes out of AMD but wasn’t able to get many specifics. You’ll see the effect in our benchmark results, though, when the older-rev Athlon 64 FX-55, with a markedly faster memory subsystem, struggles to keep pace with the revision E Opteron 152. The X2 chips also perform relatively better clock for clock in single-threaded apps than one might otherwise expect.
Like the revision E chips, the Athlon 64 X2 is manufactured with AMD’s 90nm process using silicon-on-insulator (SOI) technology. In addition, AMD has optimized its (much larger) dual-core chips to consume no more power and generate no more heat than its single core parts by tweaking manufacturing techniques. The relatively lower power consumption comes at the expense of clock speed, but obviously the tradeoff isn’t huge, since the X2 tops out at just 200MHz less than the Athlon 64 FX series currently does. Between the rev-E performance increases, the power optimizations, and the presence of two cores on one chip, the Athlon 64 X2’s performance per clock and per watt should be a sizeable advance over the CPUs AMD was selling just months ago.
We have focused our testing today on the question of thread-level parallelism, in part because we believe that is the most important performance question one can explore in relation to multi-core processors. However, we are excited about the possibilities for better multitasking that may come with dual-core CPUs, and we’d be glad to take your suggestions for testing multitasking scenarios.
Also, we have included results for the Pentium D 840 in our testing, which we obtained by disabling Hyper-Threading on our Extreme Edition 840. Since the Pentium D 840 is just an Extreme Edition 840 sans HT, the numbers should be valid.
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.
Our test systems were configured like so:
|Processor||Opteron 148 2.2GHz; Opteron 152 2.6GHz; Opteron 175 2.2GHz; Dual Opteron 248 2.2GHz; Dual Opteron 252 2.6GHz; Dual Opteron 275 2.2GHz||Xeon 3.2GHz (Nocona 1MB); Xeon 3.4GHz (Nocona 1MB); Dual Xeon 3.2GHz (Nocona 1MB); Dual Xeon 3.4GHz (Nocona 1MB)||Pentium 4 660 3.6GHz; Pentium D 840 3.2GHz; Pentium Extreme Edition 840 3.2GHz||Pentium 4 Extreme Edition 3.73GHz||Athlon 64 3800+ 2.4GHz (Venice); Athlon 64 4000+ 2.4GHz; Athlon 64 FX-55 2.6GHz; Athlon 64 X2 4200+ 2.2GHz; Athlon 64 X2 4800+ 2.4GHz|
|System bus||1GHz HyperTransport||800MHz (200MHz quad-pumped)||800MHz (200MHz quad-pumped)||1066MHz (266MHz quad-pumped)||1GHz HyperTransport|
|Motherboard||Tyan Thunder K8WE S2895||SuperMicro X6DAL-G||Intel D955XBK||Intel D955XBK||Asus A8N-SLI Deluxe|
|BIOS revision||2/21/2005 beta||080010||BK95510J.86A.1152||BK95510J.86A.1234||MCT2/dualcore|
|North bridge||nForce Professional 2200; nForce Professional 2050; AMD 8131 PCI-X Tunnel||Intel E7525||955X MCH||955X MCH||nForce4 SLI|
|Chipset drivers||SMBus driver 4.45; IDE driver 4.75||OS integrated||INF Update 220.127.116.119||INF Update 18.104.22.1689||SMBus driver 4.45; IDE driver 4.75|
|Memory size||2GB (4 DIMMs)||2GB (4 DIMMs)||1GB (2 DIMMs)||1GB (2 DIMMs)||1GB (2 DIMMs)|
|Memory type||OCZ PC3200 512MB registered ECC DDR SDRAM at 400MHz||Kingston PC3200 512MB registered ECC DDR DRAM at 333MHz||Corsair XMS2 5400UL DDR2 SDRAM at 533MHz||Corsair XMS2 5400UL DDR2 SDRAM at 667MHz||Corsair XMS Pro 3200XL DDR SDRAM at 400MHz|
|CAS latency (CL)||3||2.5||3||4||2|
|RAS to CAS delay (tRCD)||3||3||2||2||2|
|RAS precharge (tRP)||3||3||2||2||2|
|Cycle time (tRAS)||8||7||8||8||5|
|Hard drive||Maxtor DiamondMax 10 250GB SATA 150|
|Audio drivers||NVIDIA 4.60||Realtek 22.214.171.12420||SigmaTel 5.10.4456.0||SigmaTel 5.10.4456.0||Realtek 126.96.36.19920|
|Graphics||GeForce 6800 Ultra 256MB PCI-E with ForceWare 71.84 drivers|
|OS||Windows XP Professional x64 Edition|
Note that we have more total memory on the workstation-class setups. I don’t believe any of our benchmarks are constrained by available RAM in a 1GB system, but you’ll still want to keep the difference in mind.
All tests on the Pentium systems were run with Hyper-Threading enabled, except where otherwise noted.
Thanks to Corsair, OCZ, and Kingston for providing us with memory for our testing. This matchup required lots of high-quality RAM, so we had to spread the love around. All three brands are far and away superior to generic, no-name memory.
The test systems’ Windows desktops were set at 1152×864 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.
We used the following versions of our test applications:
- SiSoft Sandra 2005 SR1 10.50 64-bit
- ScienceMark 2.0 64-bit
- Compiled binary of C Linpack port from Ace’s Hardware
- POV-Ray for Windows 3.6 64-bit
- SMPOV 4.3
- 3ds max 7.0
- Cinebench 2003
- LAME MT 3.97a 64-bit
- Xmpeg 5.0.3 with DivX Video 5.21
- Windows Media Encoder 9
- Sphinx 3.3
- picCOLOR v4.0 build 545 64-bit
- DOOM 3 1.1 with trdelta1 demo
- Far Cry 1.3 with tr3-pier demo
- Unreal Tournament 2004 v3355 with trdemo1
- 3DMark05 v120
The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
We generally start out with some memory subsystem tests, so we can see how the processors match up on that front. These results sometimes help us to understand some of the later benchmark results from real applications.
The Athlon 64’s on-die memory controller gives it a very low latency path to memory, and the X2 is no exception. This low-latency connection also yields lots of memory bandwidth, giving the X2 a bandwidth edge over the Pentium Extreme Edition 840 and the Pentium D. In the Linpack graph, we can “see” the size and performance of the CPUs’ different cache configurations. The Pentium 4 processors with 2MB of L2 cache really stand out here, of course. Notice also that the Athlon 64 X2 4800+, which runs at 2.4GHz, matches the Athlon 64 FX-55 2.6GHz in L2 cache bandwidth. Like I said, AMD’s 90nm chips appear to have faster L2 caches, clock for clock. You can also see how the X2 4200+’s smaller cache slows its performance with data sets of certain sizes.
Up next are some gaming tests, which will essentially serve to illustrate the futility of running a dual-core processor in a single-threaded application. Notice that we’ve included above each result a little graph generated by the Windows Task Manager as the benchmark ran on our dual Opteron 275 system (with four total CPU cores). This should give you some indication of the amount of threading in the application. In some cases with single-threaded apps like the games below, the task will oscillate back and forth between one CPU and the next, but total utilization generally won’t go above 50% for a dual-core or 25% for a quad-core (or quad-front-end, in the case of the XE 840 with Hyper-Threading) system.
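Those 50% and 25% ceilings fall straight out of the arithmetic: Task Manager reports total utilization across all CPU front ends, so one busy thread can only ever fill one of them. A quick sketch of that bound follows. The function name `max_utilization_pct` is ours, and the model assumes each thread can keep exactly one logical CPU fully busy.

```python
def max_utilization_pct(busy_threads, logical_cpus):
    """Upper bound on total utilization, as a percentage of all logical CPUs.

    Assumes each busy thread can saturate at most one logical CPU.
    """
    return 100.0 * min(busy_threads, logical_cpus) / logical_cpus

# One busy thread on single-, dual-, and quad-front-end systems.
for cpus in (1, 2, 4):
    print(f"{cpus} logical CPU(s): at most {max_utilization_pct(1, cpus):.0f}%")
```

The scheduler may bounce the thread from one CPU to another, which spreads the load across the graphs without raising the total.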
We tested performance by playing back a custom-recorded demo that should be fairly representative of most of the single-player gameplay in Doom 3.
Our Far Cry demo takes place on the Pier level, in one of those massive, open outdoor areas so common in this game. Vegetation is dense, and view distances can be very long.
Unreal Tournament 2004
Our UT2004 demo shows yours truly putting the smack down on some bots in an Onslaught game.
The Athlon 64 X2’s gaming performance is outstanding, probably more because of AMD’s enhancements to its rev-E cores than any benefit from having a second CPU core onboard. Despite a 200MHz clock speed handicap, the Athlon 64 X2 4800+ outperforms the Athlon 64 FX-55 in two of the three gaming benchmarks above.
3DMark05’s overall score is almost entirely dictated by the limitations of our GeForce 6800 Ultra graphics card, but the multithreaded CPU test in 3DMark05 is another story. The X2 processors perform especially well here.
POV-Ray just recently made the move to 64-bit binaries, and thanks to the nifty SMPOV distributed rendering utility, we’ve been able to make it multithreaded, as well. SMPOV spins off any number of instances of the POV-Ray renderer, and it will bisect the scene in several different ways. For this scene, the best choice was to divide the screen up horizontally between the different threads, which provides a fairly even workload.
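To give a feel for that kind of split, here is a minimal sketch of dividing an image into horizontal bands, one per worker. This is not SMPOV’s code, and we haven’t seen how SMPOV does it internally; `split_rows` is a hypothetical helper that only illustrates the idea of handing each renderer instance a contiguous range of rows.

```python
def split_rows(height, workers):
    """Divide rows 0..height-1 into contiguous bands, one per worker.

    Returns a list of (start_row, end_row) pairs with end_row exclusive.
    Any leftover rows are handed out one each to the first few workers.
    """
    base, extra = divmod(height, workers)
    bands = []
    start = 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        bands.append((start, start + size))
        start += size
    return bands

# Example: a 480-row image shared among four renderer instances.
for worker, (first, last) in enumerate(split_rows(480, 4)):
    print(f"worker {worker}: rows {first} to {last - 1}")
```

Equal-sized bands only produce an even workload when the scene’s complexity is spread fairly evenly from top to bottom, which is why the best way to slice things up depends on the scene.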
With four threads taking full advantage of Hyper-Threading, it’s fairly close, but the Athlon 64 X2 4800+ turns in a faster render time than the Pentium XE 840. Similarly, the X2 4200+ beats out the Pentium D 840 here.
We tested 3ds max performance by rendering 20 frames of a sample scene at 320×240 resolution. This particular scene makes use of a motion-blur effect that requires extensive multi-pass rendering. We tried two different renderers: 3ds max’s default scanline renderer and its built-in version of the mental ray renderer.
If anything, the X2’s lead over the dual-core Pentiums is more pronounced in 3ds max. Note that mental ray refuses to make use of all four CPU cores on the Opteron 275 due to licensing issues.
Cinema 4D’s rendering engine does a very nice job of distributing the load across multiple processors, as the Task Manager graph shows.
Impressively, the X2 4800+ edges out the Pentium XE 840 in Cinema 4D’s renderer, traditionally something of an Intel stronghold in our benchmark suite. The 4200+ follows suit, outpacing the Pentium D 840. As with all of the rendering tests, the 4200+ also positively obliterates its more expensive single-core peers like the Athlon 64 FX-55 and the P4 Extreme Edition 3.73GHz.
Cinebench’s shading tests are all single threaded, and it shows. The X2s perform respectably but are nothing special here.
LAME MT is, as you might have guessed, a multithreaded version of the LAME MP3 encoder. LAME MT was created as a demonstration of the benefits of multithreading specifically on a Hyper-Threaded CPU like the Pentium 4. You can even download a paper (in Word format) describing the programming effort.
Rather than run multiple parallel threads, LAME MT runs the MP3 encoder’s psycho-acoustic analysis function on a separate thread from the rest of the encoder using simple linear pipelining. That is, the psycho-acoustic analysis happens one frame ahead of everything else, and its results are buffered for later use by the second thread. The author notes, “In general, this approach is highly recommended, for it is exponentially harder to debug a parallel application than a linear one.”
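The structure described there is a classic two-stage pipeline with a small buffer in between. Here is a rough sketch of the pattern, not LAME MT’s actual code: `analyze` and `encode` are hypothetical stand-ins for the psycho-acoustic analysis and the rest of the encoder, and the one-slot queue is our assumption about how far ahead the first stage is allowed to run.

```python
import queue
import threading

def analyze(frame):
    # Hypothetical stand-in for the psycho-acoustic analysis step.
    return sum(frame) / len(frame)

def encode(frame, analysis):
    # Hypothetical stand-in for the rest of the encoder.
    return [round(sample - analysis, 3) for sample in frame]

def pipelined_encode(frames, depth=1):
    """Run analysis on its own thread, buffered ahead of the encoder."""
    buffered = queue.Queue(maxsize=depth)  # bounded buffer between stages

    def analysis_stage():
        for frame in frames:
            buffered.put((frame, analyze(frame)))  # blocks when buffer is full
        buffered.put(None)  # sentinel: no more frames

    worker = threading.Thread(target=analysis_stage)
    worker.start()

    encoded = []
    while True:
        item = buffered.get()
        if item is None:
            break
        frame, analysis = item
        encoded.append(encode(frame, analysis))
    worker.join()
    return encoded

print(pipelined_encode([[1, 2, 3], [4, 4, 4]]))
```

Because frames flow through in order, the output matches what a plain sequential loop would produce, which is the debugging advantage the author is talking about. Note that standard Python threads won’t actually run CPU-bound work like this in parallel, so treat the sketch as a picture of the structure rather than a speedup demo.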
We have results for two different 64-bit versions of LAME MT from different compilers, one from Microsoft and one from Intel, doing two different types of encoding, variable bit rate and constant bit rate. We are encoding a massive 10-minute, 6-second 101MB WAV file here, as we have done in our previous CPU reviews.
Multithreading produces some nice performance gains on all of the processors able to take proper advantage of it, and both X2 models match up well against the competing Pentiums as a result.
We used the Xmpeg/DivX combo to convert a DVD .VOB file of a movie trailer into DivX format. Like LAME MT, this application is only dual threaded.
Windows Media Encoder video encoding
We asked Windows Media Encoder to convert a gorgeous 1080-line WMV HD video clip into a 640×460 streaming format using the Windows Media Video 8 Advanced Profile codec.
The X2 processors would appear to have few weaknesses, as they match or surpass the competing Pentium processors in video encoding, another customary Intel strength.
We’re using the 64-bit beta version of ScienceMark for these tests, and several of its components are multithreaded. ScienceMark author Alexander Goodrich says this about the Molecular Dynamics simulation:
Molecular Dynamics is lightly multithreaded – one thread takes care of U/I aspects, and the other thread takes care of the computation. The computation itself is not multithreaded, though Tim and I were looking into ways of changing the algorithm to support multi-threading programming a couple years ago – it’s a lot of effort, unfortunately. When MD [is] running there [is] a total of 2 threads for the process.
Here are the results:
The Primordia test “calculates the Quantum Mechanical Hartree-Fock Orbitals for each electron in any element of the periodic table.” Alex says this about it:
Primordia is multithreaded. Two main tasks occur which allow this to happen. Essentially, we identified 2 parallel tasks that could be done. We could probably take this a step further and optimize it even more. There is an issue, however, with the Pentium Extreme Edition that we’ve identified. The second computation thread gets executed on the logical HT thread rather than the 2nd core, so performance isn’t as good as it could be. This will be fixed in the next revision. This doesn’t effect [sic] the regular Pentium D. A workaround could include disabling HT on Pentium EE. There are 3 threads for primordia – 2 threads for computation, 1 thread for U/I.
The next two tests are only single-threaded, and they don’t make as good use of any of the CPUs here as they could if they were better optimized. The ScienceMark team has plans to incorporate linear algebra libraries from Intel and AMD in order to boost performance.
All told, the X2 does well here, too. Only dual Opteron 252s are faster than the X2 4800+ in Primordia and Moldyn. The BLAS SGEMM test doesn’t look too good for any of the AMD processors, but the tables turn in the double-precision DGEMM benchmark. Truth be told, none of the CPUs are performing up to potential in the unoptimized versions of SGEMM and DGEMM included here.
Next up is SiSoft’s Sandra system diagnosis program, which includes a number of different benchmarks. The one of interest to us is the “multimedia” benchmark, intended to show off the benefits of “multimedia” extensions like MMX and SSE/2. According to SiSoft’s FAQ, the benchmark actually does a fractal computation:
This benchmark generates a picture (640×480) of the well-known Mandelbrot fractal, using 255 iterations for each data pixel, in 32 colours. It is a real-life benchmark rather than a synthetic benchmark, designed to show the improvements MMX/Enhanced, 3DNow!/Enhanced, SSE(2) bring to such an algorithm. The benchmark is multi-threaded for up to 64 CPUs maximum on SMP systems. This works by interlacing, i.e. each thread computes the next column not being worked on by other threads. Sandra creates as many threads as there are CPUs in the system and assignes [sic] each thread to a different CPU.
We’re using the 64-bit port of Sandra. The “Integer x16” version of this test uses integer numbers to simulate floating-point math. The floating-point version of the benchmark takes advantage of SSE2 to process up to eight Mandelbrot iterations at once.
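The “interlacing” scheme in SiSoft’s description can be read as a static column interleave: with N threads, thread k takes columns k, k+N, k+2N, and so on. Here is a small sketch under that reading. It is not Sandra’s code, the image is shrunk from 640×480 so it runs quickly, and the coordinate mapping, the names (`mandel_iters`, `render_columns`, `render`), and the use of a thread pool are all our own choices for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Small image for a quick run; the benchmark itself is described as 640x480.
WIDTH, HEIGHT, MAX_ITER = 64, 48, 255

def mandel_iters(cx, cy):
    """Count iterations until the point escapes, capped at MAX_ITER."""
    x = y = 0.0
    for i in range(MAX_ITER):
        if x * x + y * y > 4.0:
            return i
        x, y = x * x - y * y + cx, 2.0 * x * y + cy
    return MAX_ITER

def render_columns(worker, workers, image):
    # Worker k handles columns k, k + workers, k + 2 * workers, ...
    for col in range(worker, WIDTH, workers):
        cx = -2.0 + 3.0 * col / WIDTH
        for row in range(HEIGHT):
            cy = -1.0 + 2.0 * row / HEIGHT
            image[row][col] = mandel_iters(cx, cy)

def render(workers):
    """Render the image with one thread per worker, interleaved by column."""
    image = [[0] * WIDTH for _ in range(HEIGHT)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        jobs = [pool.submit(render_columns, k, workers, image)
                for k in range(workers)]
        for job in jobs:
            job.result()  # re-raise any exception from a worker
    return image

image = render(workers=4)
print(len(image), "rows by", len(image[0]), "columns")
```

Interleaving columns tends to balance the load better than giving each thread one big block, since the expensive pixels near the set’s interior get spread across all of the threads. As with any pure-Python threading example, this shows the work distribution only; it won’t scale across cores the way native SSE2 code does.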
The Pentium Extreme Edition 840 proves impossible for the X2s to beat in this highly optimized test of peak SSE2 performance.
Sphinx speech recognition
Ricky Houghton first brought us the Sphinx benchmark through his association with speech recognition efforts at Carnegie Mellon University. Sphinx is a high-quality speech recognition routine. We use two different versions, built with two different compilers, in an attempt to ensure we’re getting the best possible performance. However, the versions of Sphinx we’re using are only single-threaded.
These Sphinx results are one reason I think that AMD has added enhanced data prefetch to the newer K8 cores. The Athlon 64 3800+ at 2.4GHz beats out the Athlon 64 FX-55, as does the X2 4200+ at 2.2GHz. Overall, though, the Pentium 4 remains the champ in Sphinx.
picCOLOR was created by Dr. Reinert H. G. Müller of the FIBUS Institute. This isn’t Photoshop; picCOLOR’s image analysis capabilities can be used for scientific applications like particle flow analysis. Dr. Müller has supplied us with new revisions of his program for some time now, all the while optimizing picCOLOR for new advances in CPU technology, including MMX, SSE2, and Hyper-Threading. Naturally, he’s ported picCOLOR to 64 bits, so we can test performance with the x86-64 ISA.
At our request, Dr. Müller, the program’s author, added larger image sizes to this latest build of picCOLOR. We were concerned that the thread creation overhead on the test’s rather small default image size would overshadow the benefits of threading. Dr. Müller has also made picCOLOR multithreading more extensive. Eight of the 12 functions in the test are now multithreaded.
Scores in picCOLOR, by the way, are indexed against a single-processor Pentium III 1GHz system, so that a score of 4.14 works out to 4.14 times the performance of the reference machine.
Once more, the X2 asserts its excellence in a multithreaded test.
We measured the power consumption of our entire test systems, except for the monitor, at the wall outlet using a Watts Up PRO watt meter. The test rigs were all equipped with OCZ PowerStream 520W power supply units. The idle results were measured at the Windows desktop, and we used SMPOV and the 64-bit version of the POV-Ray renderer to load up the CPUs. In all cases, we asked SMPOV to use the same number of threads as there were CPU front ends in Task Manager, so four for the dual Opteron 252, four for the Pentium XE 840, two for the Opteron 175, and so on.
The graphs below have results for “power management” and “no power management.” That deserves some explanation. By “power management,” we mean SpeedStep or PowerNow/Cool’n’Quiet. (In the case of the Pentium 4 600-series processors and the XE 840, the C1E halt state is always active, even in the “no power management” tests.) Sadly, the beta BIOS we used for our Tyan S2895 motherboard didn’t support AMD’s PowerNow, so we couldn’t report scores for the Opterons with power management enabled. Similarly, the beta BIOS for our Asus A8N-SLI Deluxe mobo wouldn’t support Cool’n’Quiet (which is PowerNow with a different name) on the Athlon 64 X2 processors. AMD says all of its dual-core chips will support power management once the proper BIOS support becomes available.
As advertised, both X2 models deliver power consumption comparable to their single-core predecessors, at least according to these system-level numbers. Unfortunately, the Pentium D and XE chips don’t fare so well. The systems based on these chips suck up over 100W more than the systems based on the competing X2 processors.
Let’s start by talking about the Athlon 64 X2 4200+. This CPU generally offers better performance than its direct competitor from Intel, the Pentium D 840. Most notably, the X2 4200+ doesn’t share the Pentium D’s relatively weak performance in single-threaded tasks like our 3D gaming benchmarks. The Athlon 64 X2 4200+ also consumes less power, at the system level, than the Pentium D 840: just a little bit at idle (even without Cool’n’Quiet) but over 100W under load. That’s a very potent combo, all told. In fact, the X2 4200+ frequently outperforms the Pentium Extreme Edition 840, which costs nearly twice as much. Thanks to its dual-core config, the X2 4200+ also embarrasses some expensive single-core processors, like the Athlon 64 FX-55 and the Pentium 4 Extreme Edition 3.73GHz. Personally, I don’t think there’s any reason to pay any more for a CPU than the $537 that AMD will be asking for the Athlon 64 X2 4200+.
If you must pay more for some reason, the Athlon 64 X2 4800+ will give you the best all-around performance we’ve ever seen from a “single” CPU. The X2 4800+ beats out the Pentium Extreme Edition 840 virtually across the board, even in tests that use four threads to take best advantage of the Extreme Edition 840’s Hyper-Threading capabilities. The difference becomes even more pronounced in single-threaded applications, including games, where the Pentium XE 840 is near the bottom of the pack and the X2 4800+ is constantly near the top. The X2 4800+ also consumes considerably less power, both at idle and under load.
The X2 4800+ gives up 200MHz to its fastest single-core competitor, the Athlon 64 FX-55, but gains most of the performance back in single-threaded apps thanks to AMD’s latest round of core enhancements, included in the X2 chips. The X2 4800+ also matches the Opteron 152 in many cases thanks to Socket 939’s faster memory subsystem. Remarkably, our test system consumes the same amount of power under load with an X2 4800+ in its socket as it does with an Athlon 64 FX-55, even though the X2 is running two rendering threads and doing nearly twice the work. Amazing.
There’s not much to complain about here, but that won’t stop me from trying. I would like to see AMD extend the X2 line down two more notches by offering a couple of Athlon 64 X2 variants at 2GHz clock speeds and lower prices. I realize that by asking for this, I may sound like a bit of a freeloader or something, but hey, Intel’s doing it. No, the performance picture for Intel’s dual-core chips isn’t quite so rosy, but the lower-end Pentium D models will make the sometimes-substantial benefits of dual-core CPU technology more widely accessible. If AMD doesn’t follow suit, lots of folks will be forced to choose between one fast AMD core or two relatively slower Intel cores. I’m not so sure I won’t end up recommending the latter more often than the former.
Beyond that, the giant question looming over the Athlon 64 X2 is about availability, as in, “When can I get one?” Let’s hope the answer is sooner rather than later, because these things are sweet.