Part of the reason we’ve avoided doing so is that, let’s face it, it has the potential to be kind of cheesy. There’s much more to a CPU’s value proposition than a cold cost-benefit analysis can capture, and in truth, doing such an analysis well can prove rather tricky. That’s why you should read our CPU reviews and our system building guides to see what they recommend.
A vocal contingent of our readers has long been asking for a closer look at price-performance issues, and we think we’ve cooked up some novel ways of expressing that data that may make it feasible. So we’ve decided to give it a shot.
Fortuitously, AMD and Intel both took an axe to their prices last month, and we recently added Intel’s $113 Core 2 Duo E4300 to our constellation of test results, so now seems like a particularly appropriate time to consider performance per dollar. Join us as we look at the value proposition of 16 CPUs, from the Athlon 64 X2 3600+ all the way up to the Core 2 Extreme QX6800, across a wide range of games, applications, and even energy efficiency tests. Some of what we found surprised us, and it may change the way you think about CPU value.
Quantifying CPU value
In theory, quantifying value is easy. We can measure performance quite well, prices are easy to check, and dividing the former by the latter gives you performance per dollar. Except it’s not quite that simple, for a number of reasons.
First, processors aren’t always the only factors affecting performance in a given task or benchmark. Games, for example, tend to favor GPU pixel-pushing horsepower over CPU computational grunt. Memory bandwidth can limit performance, software and operating systems can be wildly inefficient, and don’t get us started on the bottlenecking potential of hard disk drives. Then there’s the issue of whether software makes the most of a processor’s abilities. This one is a particular problem for quad-core systems, since few applications are multithreaded well enough to exploit them fully.
And, of course, there’s the question of cost. Sure, it’s easy to pull from official price lists, but bulk pricing doesn’t always track with street prices. Bare processor prices don’t take into account overall platform costs, either, or the cost of power consumption on your utility bill.
We’ve attempted to mitigate some of these issues by providing value analysis for a wide range of applications and processors, so we can draw conclusions based on overall trends rather than just a handful of numbers. We can illustrate which processors offer better value than others and under which circumstances.
Here’s a quick run-down of the specifications of the Intel processors we’ll be looking at today. We took these prices from Intel’s official processor price list, since street prices tend to vary from vendor to vendor and fluctuate without warning.
|Model||Clock speed||Cores||L2 cache (total)||Fab process||TDP||Price|
|Core 2 Duo E4300||1.8GHz||2||2MB||65nm||65W||$113|
|Core 2 Duo E6300||1.86GHz||2||2MB||65nm||65W||$163|
|Core 2 Duo E6400||2.13GHz||2||2MB||65nm||65W||$183|
|Core 2 Duo E6600||2.4GHz||2||4MB||65nm||65W||$224|
|Core 2 Duo E6700||2.66GHz||2||4MB||65nm||65W||$316|
|Core 2 Extreme X6800||2.93GHz||2||4MB||65nm||75W||$999|
|Core 2 Quad Q6600||2.4GHz||4||8MB||65nm||105W||$530|
|Core 2 Extreme QX6700||2.66GHz||4||8MB||65nm||130W||$999|
|Core 2 Extreme QX6800||2.93GHz||4||8MB||65nm||130W||$1199|
This is a classic example of CPU price structure. Intel’s Core 2 prices ramp up much more quickly than key specs like clock speed, cache size, or the number of cores. For instance, the E6300 may have half the number of cores, one quarter the cache, and a roughly 37% lower clock frequency than the flagship QX6800, but it costs 86% less. Or consider the E6600 and Q6600, both of which run at 2.4GHz. The latter is essentially twice the former, but the difference in price is actually close to 2.4 times. Spending more doesn’t necessarily get you an equitable boost in computational power, a trend we see continue with AMD’s offerings.
|Model||Clock speed||Cores||L2 cache (total)||Fab process||TDP||Price|
|Athlon 64 X2 3600+||1.9GHz||2||1MB||65nm||65W||$73|
|Athlon 64 X2 4400+||2.3GHz||2||1MB||65nm||65W||$121|
|Athlon 64 X2 5000+||2.6GHz||2||1MB||65nm||65W||$167|
|Athlon 64 X2 5600+||2.8GHz||2||2MB||90nm||89W||$188|
|Athlon 64 X2 6000+||3.0GHz||2||2MB||90nm||125W||$241|
|Athlon 64 FX-72||2.8GHz||4||4MB||90nm||125W x 2||$599|
|Athlon 64 FX-74||3.0GHz||4||4MB||90nm||125W x 2||$799|
AMD’s price range dips lower than Intel’s, but it also doesn’t reach beyond $799. Then again, the Athlon 64 FX-72 and FX-74 require a dual-socket motherboard that currently sells for more than $325, so there’s a considerable additional cost associated with that platform.
Here, also, prices ramp up faster than key specs. An FX-72 setup gets you the same clock speed, number of cores, and cache size as a pair of X2 5600+ processors, but it costs more than three times as much as a single X2 5600+, or roughly 1.6 times the price of the pair. Similarly, the Athlon 64 X2 3600+ gives up 37% of the clock speed and 50% of the cache of the 6000+, but sells for just 30% of the cost. Based on their specs alone, budget chips certainly look to have the best value propositions.
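For readers who want to check the ratios quoted for both price lists, here is a minimal Python sketch. It uses only the list prices and clock speeds from the two tables above; the helper names are ours, chosen purely for illustration.

```python
# List prices (USD) and clock speeds (GHz) copied from the two tables above.
PRICE = {"E6300": 163, "E6600": 224, "Q6600": 530, "QX6800": 1199,
         "X2 3600+": 73, "X2 5600+": 188, "X2 6000+": 241, "FX-72": 599}
CLOCK = {"E6300": 1.86, "QX6800": 2.93, "X2 3600+": 1.9, "X2 6000+": 3.0}


def percent_lower(low, high):
    """Return how far `low` sits below `high`, as a percentage of `high`."""
    return 100.0 * (1.0 - low / high)


def times_as_much(a, b):
    """Return how many times larger `a` is than `b`."""
    return a / b


print(f"E6300 vs QX6800 price: {percent_lower(PRICE['E6300'], PRICE['QX6800']):.1f}% lower")
print(f"E6300 vs QX6800 clock: {percent_lower(CLOCK['E6300'], CLOCK['QX6800']):.1f}% lower")
print(f"Q6600 vs E6600 price: {times_as_much(PRICE['Q6600'], PRICE['E6600']):.2f}x")
print(f"X2 3600+ vs X2 6000+ clock: {percent_lower(CLOCK['X2 3600+'], CLOCK['X2 6000+']):.1f}% lower")
print(f"X2 3600+ as share of X2 6000+ price: {100 * PRICE['X2 3600+'] / PRICE['X2 6000+']:.1f}%")
print(f"FX-72 vs one X2 5600+: {times_as_much(PRICE['FX-72'], PRICE['X2 5600+']):.2f}x")
print(f"FX-72 vs two X2 5600+: {times_as_much(PRICE['FX-72'], 2 * PRICE['X2 5600+']):.2f}x")
```

Keep in mind these are list prices only; as noted earlier, street prices and platform costs can shift the picture.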
Charting relative value is a little new for us, so we’ve come up with a couple of ways to express performance per dollar. The first is the easiest: a simple graph depicting the value of a processor’s score in a given test—be it in frames per second, or as an encoding rate, or even an arbitrary benchmark score—divided by that processor’s price. In some cases, such as with media encoding, we’ve had to do a little multiplication to avoid generating value scores with too many decimal places to express succinctly. This doesn’t taint our results, though; it just makes them easier to read.
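The arithmetic behind that first chart type can be sketched in a few lines of Python. The function name and the `scale` parameter are our own illustrative choices, and the encoding rates below are made-up placeholders rather than measured results. The only real data point is the 80 FPS Oblivion average for the $73 X2 3600+ that we quote later on.

```python
def value_score(score, price_usd, scale=1.0):
    """Return performance per dollar, multiplied by an optional readability scale."""
    if price_usd <= 0:
        raise ValueError("price_usd must be positive")
    return scale * score / price_usd


# The only score quoted in the text: the $73 X2 3600+ averages 80 FPS in Oblivion.
fps_per_dollar = value_score(80, 73)
print(f"X2 3600+ in Oblivion: {fps_per_dollar:.2f} FPS per dollar")

# HYPOTHETICAL encoding rates (not measured results), paired with list prices
# from the tables, to show that a constant scale factor leaves the ranking alone.
rates = {"X2 3600+": (0.020, 73), "E4300": (0.028, 113), "E6600": (0.045, 224)}
raw = {name: value_score(rate, price) for name, (rate, price) in rates.items()}
scaled = {name: value_score(rate, price, scale=1000)
          for name, (rate, price) in rates.items()}

ranking_raw = sorted(raw, key=raw.get, reverse=True)
ranking_scaled = sorted(scaled, key=scaled.get, reverse=True)
print(ranking_raw == ranking_scaled)
```

Because every chip’s score is multiplied by the same constant, the order of the bars and the ratios between them are unchanged; only the number of leading zeros differs.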
Our second tool for evaluating processor value comes in the form of a scatter plot, which looks like so:
Performance is tracked along the Y axis, and price along the X. Since we’re interested in chips that offer the best value, we’ll be looking for vertical progression on the performance axis with as little progression on the price axis as possible. Hypothetically, the best possible processor would sit at the top left of the plot, offering very high performance for free. Conversely, you wouldn’t want to buy a chip sitting at the bottom right of the plot, where price is high and performance is low.
Of course, picking a processor isn’t typically about what’s best as much as what sits in the mythical price-performance sweet spot. To determine that using our scatter plots, you’ll want to find the cutoff where either a) performance keeps increasing but starts to cost more and more, or b) performance stops going up significantly—or at all—with price. In a scatter plot like the one above, for instance, the latter would apply. There are exceptions to this rule, though, as we’ll see in the next few pages.
The scatter plot might look a little daunting, but it has the advantage of providing an instantaneous look at how price scales with performance. Most of us can afford a processor that costs a little more than the $73 Athlon 64 X2 3600+, so it’s helpful to be able to spot the best performing CPU within a given budget.
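One way to read such a plot programmatically is to throw out any chip that is “dominated,” meaning some other chip costs no more and performs at least as well, and is strictly better on one of the two. What remains is the set of chips worth considering at each budget. The Python sketch below does that filtering. The prices are list prices from the tables above, but the scores are placeholders we invented for illustration, not benchmark results.

```python
def undominated(chips):
    """Return names of chips that no other chip beats on both price and score.

    `chips` maps a name to a (price_usd, score) pair; higher scores are better.
    """
    keep = []
    for name, (price, score) in chips.items():
        dominated = any(
            other != name
            and o_price <= price
            and o_score >= score
            and (o_price < price or o_score > score)
            for other, (o_price, o_score) in chips.items()
        )
        if not dominated:
            keep.append(name)
    # Sort by price so the result reads left to right, like the scatter plot.
    return sorted(keep, key=lambda n: chips[n][0])


# Prices are list prices from the tables; scores are HYPOTHETICAL placeholders.
example = {
    "X2 3600+": (73, 50.0),
    "E4300": (113, 62.0),
    "X2 4400+": (121, 58.0),
    "E6600": (224, 80.0),
    "X6800": (999, 90.0),
}
print(undominated(example))
```

With these placeholder scores, the X2 4400+ drops out because the E4300 is both cheaper and faster. Note that a chip surviving this filter isn’t necessarily a good deal; the filter only removes choices that are never the right answer.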
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and the results were averaged.
In some cases, getting the results meant simulating a slower chip with a faster one. For instance, our Core 2 Duo E6600 and E6700 processors are actually a Core 2 Extreme X6800 processor clocked down to the appropriate speeds. Their performance should be identical to that of the real thing. Similarly, our Athlon 64 FX-72 results come from an underclocked pair of Athlon 64 FX-74s, our Athlon 64 X2 4400+ is an underclocked X2 5000+ (both 65nm), and our Athlon 64 X2 5600+ is an underclocked Athlon 64 X2 6000+.
Our test systems were configured like so:
|Processor||Core 2 Duo E4300 1.8GHz, Core 2 Duo E6300 1.86GHz, Core 2 Duo E6400 2.13GHz, Core 2 Duo E6600 2.4GHz, Core 2 Duo E6700 2.66GHz, Core 2 Extreme X6800 2.93GHz, Core 2 Quad Q6600 2.4GHz, Core 2 Extreme QX6700 2.66GHz, Core 2 Extreme QX6800 2.93GHz||Athlon 64 X2 3600+ 1.9GHz (65nm), Athlon 64 X2 4400+ 2.3GHz (65nm), Athlon 64 X2 5000+ 2.6GHz (65nm), Athlon 64 X2 5600+ 2.8GHz (90nm), Athlon 64 X2 6000+ 3.0GHz (90nm)||Athlon 64 FX-72 2.8GHz, Athlon 64 FX-74 3.0GHz|
|System bus||1066MHz (266MHz quad-pumped)||1GHz HyperTransport||1GHz HyperTransport|
|Motherboard||Intel D975XBX2||Asus M2N32-SLI Deluxe||Asus L1N64-SLI WS|
|North bridge||975X MCH||nForce 590 SLI SPP||nForce 680a SLI|
|South bridge||ICH7R||nForce 590 SLI MCP||nForce 680a SLI|
|Chipset drivers||INF Update 18.104.22.1680, Intel Matrix Storage Manager 6.21||ForceWare 15.00||ForceWare 15.00|
|Memory size||2GB (2 DIMMs)||2GB (2 DIMMs)||2GB (4 DIMMs)|
|Memory type||Corsair TWIN2X2048-6400C4 DDR2 SDRAM at 800MHz||DDR2 SDRAM at 800MHz||Crucial Ballistix PC6400 DDR2 SDRAM at 800MHz|
|CAS latency (CL)||4||4||4|
|RAS to CAS delay (tRCD)||4||4||4|
|RAS precharge (tRP)||4||4||4|
|Cycle time (tRAS)||12||12||12|
|Audio||Integrated ICH7R/STAC9274D5 with Sigmatel 22.214.171.12474 drivers||Integrated nForce 590 MCP/AD1988B with Soundmax 126.96.36.19900 drivers||Integrated nForce 680a SLI/AD1988B with Soundmax 188.8.131.5200 drivers|
|Hard drive||Maxtor DiamondMax 10 250GB SATA 150|
|Graphics||GeForce 7900 GTX 512MB PCIe with ForceWare 100.64 drivers|
|OS||Windows Vista Ultimate x64 Edition|
Our Core 2 Duo E6400 processor came to us courtesy of the fine folks up north at NCIX. Those of you who are up in Canada will definitely want to check them out as a potential source of PC hardware and related goodies.
Thanks to Corsair for providing us with memory for our testing. Their products and support are far and away superior to generic, no-name memory.
Also, all of our test systems were powered by OCZ GameXStream 700W power supply units. Thanks to OCZ for providing these units for our use in testing.
The test systems’ Windows desktops were set at 1280×1024 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled.
We used the following versions of our test applications:
- POV-Ray for Windows 3.7 beta 19a 64-bit
- Cinebench 9.5 64-bit Edition
- Windows Media Encoder 9 x64 Edition
- picCOLOR 4.0 build 598 64-bit
- notfred’s Folding benchmark CD 10/31/06 revision
- The Panorama Factory 4.4 x64 Edition
- CASE Lab Euler3d CFD benchmark 2.2
- MyriMatch proteomics benchmark
- Valve Source Engine particle simulation benchmark
- Valve VRAD map build benchmark
- LAME MT 3.97a 64-bit
- The Elder Scrolls IV: Oblivion 1.1
- Rainbow Six: Vegas 1.02
The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
The Elder Scrolls IV: Oblivion
We tested Oblivion by manually playing through a specific point in the game five times while recording frame rates using the FRAPS utility. Each gameplay sequence lasted 60 seconds. This method has the advantage of simulating real gameplay quite closely, but it comes at the expense of precise repeatability. We believe five sample sessions are sufficient to get reasonably consistent results. In addition to average frame rates, we’ve included the low frame rates, because those tend to reflect the user experience in performance-critical situations. In order to diminish the effect of outliers, we’ve reported the median of the five low frame rates we encountered.
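To show why we report the median of the low frame rates rather than their mean, here is a small Python sketch. It assumes each session yields one average and one low frame rate, and that the reported average is the mean of the per-session averages. The session numbers are hypothetical, made up so that one session contains an outlier.

```python
from statistics import mean, median


def summarize_sessions(sessions):
    """Summarize FRAPS-style sessions given as (average_fps, low_fps) pairs."""
    averages = [avg for avg, _ in sessions]
    lows = [low for _, low in sessions]
    return {
        "average_fps": mean(averages),
        "median_low_fps": median(lows),
        "mean_low_fps": mean(lows),  # shown only for comparison
    }


# HYPOTHETICAL numbers for five 60-second sessions; the third has an outlier low.
sessions = [(81.2, 58), (79.5, 55), (80.3, 21), (78.9, 57), (80.1, 56)]
summary = summarize_sessions(sessions)
print(summary)
```

In this made-up example, a single hitch drags the mean of the lows well below what the other four sessions saw, while the median stays with the typical sessions.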
For this test, we set Oblivion‘s graphical quality to “Medium” but with HDR lighting enabled and vsync disabled, at 800×600 resolution. We’ve chosen this relatively low display resolution in order to prevent the graphics card from becoming a bottleneck, so differences between the CPUs can shine through.
Notice the little green plot with four lines above the benchmark results. That’s a snapshot of the CPU utilization indicator in Windows Task Manager, which helps illustrate how much the application takes advantage of up to four CPU cores, when they’re available. I’ve included these Task Manager graphics whenever possible throughout our results. In this case, Oblivion really only takes full advantage of a single CPU core, although Nvidia’s graphics drivers use multithreading to offload some vertex processing chores.
In raw performance per dollar terms, the Athlon 64 X2 3600+ runs away with this one, followed at a distance by the Core 2 Duo E4300 and Athlon 64 X2 4400+. Quad-core performance per dollar looks pretty dismal in Oblivion, in part due to the fact that the game doesn’t actually take advantage of four cores.
Looking over to our scatter plot helps clarify a few things, though. Even at a display resolution of 800×600 with the detail turned down—settings that should highlight differences in CPU performance—variations in frame rate between the chips here are fairly small. We expect those differences to shrink even further once you turn up the detail, at which point performance will really be dependent on the graphics card more than anything.
Considering even the Athlon 64 X2 3600+ can get you an average of 80 FPS (and a minimum of 57 FPS, close to the 60Hz cap on many LCD monitors) in this scenario where the GPU isn’t a bottleneck, we wouldn’t recommend spending much more on a processor to run this game.
Rainbow Six: Vegas
Rainbow Six: Vegas is based on Unreal Engine 3 and is a port from the Xbox 360. For both of these reasons, it’s one of the first PC games that’s multithreaded, and it ought to provide an illuminating look at CPU gaming performance.
For this test, we set the game to run at 800×600 resolution with high dynamic range lighting disabled. “Hardware skinning” (via the GPU) was disabled, leaving that burden to fall on the CPU. Shadow quality was set to very low, and motion blur was enabled at medium quality. I played through a 90-second sequence of the game’s Terrorist Hunt mode on the “Dante’s” level five times, capturing frame rates with FRAPS, as we did with Oblivion.
What we’ve just said for Oblivion rings even truer for Rainbow Six: Vegas. Here at 800×600, there’s a difference of just 13.2 FPS between the $113 Core 2 Duo E4300 and the $1,199 Core 2 Extreme QX6800. Once you turn up the detail, that gap will likely narrow even more. Again, therefore, we wouldn’t recommend spending all that much on a processor to run this game. Our FPS-per-dollar chart echoes this: the best value clearly lies with the Athlon 64 X2 3600+ and the Core 2 Duo E4300. (The X2 4400+ can be counted out, since the E4300 offers higher performance for less.)
Valve Source engine particle simulation
Next up are a couple of tests we picked up during a visit to Valve Software, the developers of the Half-Life games. They’ve been working to incorporate support for multi-core processors into their Source game engine, and they’ve cooked up a couple of benchmarks to demonstrate the benefits of multithreading.
The first of those tests runs a particle simulation inside of the Source engine. Most games today use particle systems to create effects like smoke, steam, and fire, but the realism and interactivity of those effects is limited by the available computing horsepower. Valve’s particle system distributes the load across multiple CPU cores.
More CPU-bound applications like Valve’s particle simulation benchmark also give the X2 3600+ the top spot in value terms, but they nonetheless see very significant performance gains from faster chips. The Core 2 Quad Q6600’s score is nearly four times that of the X2 3600+, for instance. Still, it’s hard to argue with cold numbers: the Q6600’s price tag is over seven times that of the 3600+’s, and it sits in the bottom half of our score point/dollar chart, clearly bested by dual-core chips on the value scale despite their significantly lower performance.
Within the dual-core realm, a glance at our value chart and scatter plot suggests the Core 2 Duo E4300, the Athlon 64 X2 5600+, and the Core 2 Duo E6600 are the way to go if you’d like a little extra performance over the Athlon 64 X2 3600+. The X2 4400+ doesn’t have a bad value proposition, but it’s no match for the C2D E4300, as our scatter plot shows.
Valve VRAD map compilation
This next test processes a map from Half-Life 2 using Valve’s VRAD lighting tool. Valve uses VRAD to precompute lighting that goes into its games. This isn’t a real-time process, and it doesn’t reflect the performance one would experience while playing a game. It does, however, show how multiple CPU cores can speed up game development.
This second Valve test mirrors the first one somewhat. There’s a drastic gap between dual-core and quad-core solutions on the raw performance scale, but high prices prevent quad-core chips from faring all that well on the value scale. Quad-core offerings are obviously your best bet if money is no object (or if you need to compile maps on a regular basis, as the time savings from faster compiles really do add up), but the rest of us will be more interested in chips like the Athlon 64 X2 3600+, the Core 2 Duo E4300, the Core 2 Duo E6400, and the Core 2 Duo E6600. From best to worst value, those appear to be the best deals in the dual-core segment.
The Panorama Factory
The Panorama Factory handles an increasingly popular image processing task: joining together multiple images to create a wide-aspect panorama. This task can require lots of memory and can be computationally intensive, so The Panorama Factory comes in a 64-bit version that’s multithreaded. We asked it to join four pictures, each eight megapixels, into a glorious panorama of the interior of Damage Labs. The program’s timer function captures the amount of time needed to perform each stage of the panorama creation process. We’ve also added up the total operation time to give us an overall measure of performance.
The scatter plot from our panorama stitching benchmark shows a pattern reminiscent of the previous two tests, but here the gap between dual- and quad-core chips is much narrower. As a result, the Core 2 Quad Q6600 is even worse off in our value chart.
On the dual-core front, where the value lies once again, we see an interesting pattern. AMD’s dual-core offerings clearly outpace the competition from Intel, as highlighted by their greater proximity to the Y axis on our scatter plot. Take your pick here, but remember value still goes down the higher you climb on the price ladder. Barring the X2 3600+, we’d probably pick the X2 4400+ or the X2 5600+ ourselves.
This is as good a time as any for a brief intermission to point out some other trends we’re seeing. So far, AMD’s Athlon 64 FX-72 and FX-74 offerings fare quite poorly—and they’d fare even worse if we factored in the price premium for the only Quad FX motherboard out today, which costs around $330. We also see Intel’s Core 2 Extreme X6800 processor consistently dipping toward the bottom right of our scatter plot, where value is worst. It really doesn’t pay to be a premium dual-core chip in this day and age.
picCOLOR was created by Dr. Reinert H. G. Müller of the FIBUS Institute. This isn’t Photoshop; picCOLOR’s image analysis capabilities can be used for scientific applications like particle flow analysis. Dr. Müller has supplied us with new revisions of his program for some time now, all the while optimizing picCOLOR for new advances in CPU technology, including MMX, SSE2, and Hyper-Threading. Naturally, he’s ported picCOLOR to 64 bits, so we can test performance with the x86-64 ISA. Eight of the 12 functions in the test are multithreaded, and in this latest revision, five of those eight functions use four threads.
Scores in picCOLOR, by the way, are indexed against a single-processor Pentium III 1 GHz system, so that a score of 4.14 works out to 4.14 times the performance of the reference machine.
Things are a little tighter in our picCOLOR benchmark. Here, the 3600+, E4300, 5000+, 5600+, and E6600 appear to be the best deals among dual-core offerings. The 4400+ isn’t a bad choice by any means, but the E4300 outperforms it by a hair while costing slightly less.
Up on the quad-core front, the Q6600 is clearly the best choice here, although we shouldn’t need to remind you that it still sits in the bottom half of our performance per dollar chart. In other words, it’s a great chip if you can afford it and need the extra performance, but it’s not a K-Mart blue-light special.
Windows Media Encoder x64 Edition
Windows Media Encoder is one of the few popular video encoding tools that uses four threads to take advantage of quad-core systems, and it comes in a 64-bit version. For this test, I asked Windows Media Encoder to transcode a 153MB 1080-line widescreen video into a 720-line WMV using its built-in DVD/Hardware profile. Because the default “High definition quality audio” codec threw some errors in Windows Vista, I instead used the “Multichannel audio” codec. Both audio codecs have a variable bitrate peak of 192Kbps.
In Windows Media encoding, the very same chips come out on top as in our picCOLOR benchmark: the 3600+, E4300, 5000+, 5600+, and E6600, in order from the highest performance per dollar to the lowest.
The Q6600 is in the same spot as before, too, although this test gives us a more concrete example to illustrate its position. If you look at our encoding numbers, the Q6600 encodes our file 302.3 seconds (just over five minutes) faster than the E6600. If you don’t encode movies very often then that’s probably not a big deal, but if this is a task you carry out regularly, then those chunks of five minutes may turn into hours. It’s up to you whether you think that time saving is worth the $306 difference between the E6600 and Q6600.
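If you’d like to put rough numbers on that trade-off, the Python sketch below does the arithmetic. It assumes every encode you run resembles our test file, which is a big assumption, and it uses only the 302.3-second saving and the list prices quoted above. The function names are ours.

```python
SECONDS_SAVED_PER_ENCODE = 302.3  # Q6600 vs. E6600 on our test file, from the text
PRICE_DIFFERENCE_USD = 530 - 224  # list prices from the Intel table


def hours_saved(encodes):
    """Total hours saved after `encodes` jobs similar to our test file."""
    return encodes * SECONDS_SAVED_PER_ENCODE / 3600


def dollars_per_hour_saved(encodes):
    """Price premium divided by the hours saved so far."""
    return PRICE_DIFFERENCE_USD / hours_saved(encodes)


for n in (12, 50, 200):
    print(f"{n} encodes: {hours_saved(n):.1f} hours saved, "
          f"${dollars_per_hour_saved(n):.2f} per hour saved")
```

Longer or shorter source files would change the per-encode saving, so treat the output as a ballpark figure only.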
LAME MP3 encoding
LAME MT is a multithreaded version of the LAME MP3 encoder. LAME MT was created as a demonstration of the benefits of multithreading specifically on a Hyper-Threaded CPU like the Pentium 4. Of course, multithreading works even better on multi-core processors. You can download a paper (in Word format) describing the programming effort.
Rather than run multiple parallel threads, LAME MT runs the MP3 encoder’s psycho-acoustic analysis function on a separate thread from the rest of the encoder using simple linear pipelining. That is, the psycho-acoustic analysis happens one frame ahead of everything else, and its results are buffered for later use by the second thread. That means this test won’t really use more than two CPU cores.
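The structure described above is a classic two-stage pipeline, and it can be sketched in Python with a bounded queue. To be clear, this is our own illustration and not LAME MT’s code: `analyze` and `encode` are hypothetical stand-ins, and CPython’s global interpreter lock means this sketch shows the shape of the design rather than any real speedup.

```python
import queue
import threading


def analyze(frame):
    """Stand-in for psycho-acoustic analysis (HYPOTHETICAL placeholder)."""
    return sum(frame) / len(frame)


def encode(frame, analysis):
    """Stand-in for the rest of the encoder (HYPOTHETICAL placeholder)."""
    return (len(frame), analysis)


def pipelined_encode(frames, lookahead=1):
    """Run analysis on one thread, `lookahead` frames ahead of the encoder."""
    buffered = queue.Queue(maxsize=lookahead)  # analysis results awaiting encoding
    output = []

    def analysis_stage():
        for frame in frames:
            buffered.put((frame, analyze(frame)))  # blocks while the buffer is full
        buffered.put(None)  # sentinel: no more frames

    worker = threading.Thread(target=analysis_stage)
    worker.start()
    while True:
        item = buffered.get()
        if item is None:
            break
        frame, analysis = item
        output.append(encode(frame, analysis))
    worker.join()
    return output


print(pipelined_encode([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

Since there are only two stages, a third or fourth core has nothing to do, which is why this test tops out at two cores.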
We have results for two different 64-bit versions of LAME MT from different compilers, one from Microsoft and one from Intel, doing two different types of encoding, variable bit rate and constant bit rate. We are encoding a massive 10-minute, 6-second 101MB WAV file here, as we have done in many of our previous CPU reviews.
Surprise, surprise. The 3600+, E4300, 5000+, 5600+, and E6600 offer the five best value propositions in this test, as well. The Athlon 64 X2 6000+ nonetheless sits fairly close to the E6600 in terms of performance per dollar, and its raw performance is a wee bit higher.
Coughing up the extra dough for the Q6600 here is quite counterproductive, since this application can only use two cores at once.
Graphics is a classic example of a computing problem that’s easily parallelizable, so it’s no surprise that we can exploit a multi-core processor with a 3D rendering app. Cinebench is the first of those we’ll try, a benchmark based on Maxon’s Cinema 4D rendering engine. It’s multithreaded and comes with a 64-bit executable. This test runs with just a single thread and then with as many threads as CPU cores are available.
Cinebench breaks the gentle alternation between AMD and Intel chips by giving AMD’s offerings the clear upper hand pretty much across the board. Even the ill-fated Athlon 64 FX-72 and FX-74 chips outperform the Q6600, although Intel’s quad-core contender gets a better score in our performance per dollar chart. Considering the costs associated with AMD’s Quad FX platform, the Q6600 is what we’d recommend for this application if your budget allows room for a quad-core solution.
We’ve finally caved in and moved to the beta version of POV-Ray 3.7 that includes native multithreading. The latest beta 64-bit executable is still quite a bit slower than the 3.6 release, but it should give us a decent look at comparative performance, regardless. Performance per dollar values were generated using the performance of each CPU with four threads.
The situation we just saw in Cinebench is both mirrored and amplified in POV-Ray. AMD’s lead is so pronounced here that the FX-72 manages to overtake the Q6600 in our performance per dollar chart. Looking at the scatter plot shows why: the AMD four-core offering has a huge performance lead over its Intel competitor, but the price difference between the two is very slight. Quad FX may yet be worth the outrageously expensive motherboard and increased power bills in POV-Ray.
Our benchmarks sometimes come from unexpected places, and such is the case with this one. David Tabb is a friend of mine from high school and a long-time TR reader. He recently offered to provide us with an intriguing new benchmark based on an application he’s developed for use in his research work. The application is called MyriMatch, and it’s intended for use in proteomics, or the large-scale study of protein. I’ll stop right here and let him explain what MyriMatch does:
In shotgun proteomics, researchers digest complex mixtures of proteins into peptides, separate them by liquid chromatography, and analyze them by tandem mass spectrometers. This creates data sets containing tens of thousands of spectra that can be identified to peptide sequences drawn from the known genomes for most lab organisms. The first software for this purpose was Sequest, created by John Yates and Jimmy Eng at the University of Washington. Recently, David Tabb and Matthew Chambers at Vanderbilt University developed MyriMatch, an algorithm that can exploit multiple cores and multiple computers for this matching. Source code and binaries of MyriMatch are publicly available.
In this test, 5555 tandem mass spectra from a Thermo LTQ mass spectrometer are identified to peptides generated from the 6714 proteins of S. cerevisiae (baker’s yeast). The data set was provided by Andy Link at Vanderbilt University. The FASTA protein sequence database was provided by the Saccharomyces Genome Database.
MyriMatch uses threading to accelerate the handling of protein sequences. The database (read into memory) is separated into a number of jobs, typically the number of threads multiplied by 10. If four threads are used in the above database, for example, each job consists of 168 protein sequences (1/40th of the database). When a thread finishes handling all proteins in the current job, it accepts another job from the queue. This technique is intended to minimize synchronization overhead between threads and minimize CPU idle time.
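The job-queue scheme described above can be sketched in Python as follows. This is our own minimal illustration of the technique, not MyriMatch’s actual code. The function names and the `handle_protein` callback are hypothetical, and the “proteins” here are just integers standing in for sequences.

```python
import math
import queue
import threading


def split_into_jobs(proteins, threads, jobs_per_thread=10):
    """Split the database into roughly threads * jobs_per_thread equal jobs."""
    job_count = threads * jobs_per_thread
    job_size = max(1, math.ceil(len(proteins) / job_count))
    return [proteins[i:i + job_size] for i in range(0, len(proteins), job_size)]


def run_jobs(proteins, threads, handle_protein):
    """Let each worker pull the next job from a shared queue until none remain."""
    jobs = queue.Queue()
    for job in split_into_jobs(proteins, threads):
        jobs.put(job)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                job = jobs.get_nowait()
            except queue.Empty:
                return  # queue drained: this worker is done
            local = [handle_protein(p) for p in job]
            with lock:  # one synchronization point per job, not per protein
                results.extend(local)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return results


# 6714 stand-in "proteins" and four threads, matching the numbers in the text.
jobs = split_into_jobs(list(range(6714)), threads=4)
print(len(jobs), len(jobs[0]))
```

Having many more jobs than threads means a thread that finishes early simply grabs more work, while keeping each job fairly large limits how often threads have to touch the shared queue.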
The most important news for us is that MyriMatch is a widely multithreaded real-world application that we can use with a relevant data set. MyriMatch also offers control over the number of threads used, so we’ve tested with one to four threads. Also, this is a newer version of the MyriMatch code than we’ve used in the past, with a larger spectral collection, so these results aren’t comparable to those in some of our past articles.
Value scores were generated based on the performance of each chip with four threads.
AMD’s victory is short-lived. With 3D rendering tests behind us, we return to our progression of Athlon 64 X2 3600+, Core 2 Duo E4300, Athlon 64 X2 5000+, Athlon 64 X2 5600+, and Core 2 Duo E6600 as the five chips that offer the best value as we climb up the performance scale. There’s inarguably a pattern here.
Naturally, being a thoroughly multithreaded application, MyriMatch gives a sizeable performance advantage to the Core 2 Quad Q6600. The chip towers above its dual-core siblings and AMD’s Quad FX chips, only bested (and slightly so) by Intel’s own QX6700 and QX6800. It’s clear which chip to get if you’re more concerned about performance than value.
STARS Euler3d computational fluid dynamics
Our next benchmark is also a relatively new one for us. Charles O’Neill works in the Computational Aeroservoelasticity Laboratory at Oklahoma State University, and he contacted us recently to suggest we try the computational fluid dynamics (CFD) benchmark based on the STARS Euler3D structural analysis routines developed at CASELab. This benchmark has been available to the public for some time in single-threaded form, but Charles was kind enough to put together a multithreaded version of the benchmark for us with a larger data set. He has also put a web page online with a downloadable version of the multithreaded benchmark, a description, and some results here. (I believe the score you see there at almost 3Hz comes from our eight-core Clovertown test system.)
In this test, the application is basically doing analysis of airflow over an aircraft wing. I will step out of the way and let Charles explain the rest:
The benchmark testcase is the AGARD 445.6 aeroelastic test wing. The wing uses a NACA 65A004 airfoil section and has a panel aspect ratio of 1.65, taper ratio of 0.66, and a quarter-chord sweep angle of 45°. This AGARD wing was tested at the NASA Langley Research Center in the 16-foot Transonic Dynamics Tunnel and is a standard aeroelastic test case used for validation of unsteady, compressible CFD codes.
The CFD grid contains 1.23 million tetrahedral elements and 223 thousand nodes . . . . The benchmark executable advances the Mach 0.50 AGARD flow solution. A benchmark score is reported as a CFD cycle frequency in Hertz.
So the higher the score, the faster the computer. I understand the STARS Euler3D routines are both very floating-point intensive and oftentimes limited by memory bandwidth. Charles has updated the benchmark for us to enable control over the number of threads used. Here’s how our contenders handled the test with different thread counts.
Value scores were generated based on the performance of each chip with four threads.
Much as our 3D rendering tests strongly favored AMD chips, our computational fluid dynamics benchmark visibly gives the advantage to Intel’s lineup. The 3600+ as always tops the performance per dollar chart, but beyond that, the E4300, E6300, E6400, and E6600 beat their AMD counterparts. Despite the fact that this app clearly sees large benefits from quad-core processors (the Q6600’s score is a testament to that), AMD’s Quad FX offerings are both quashed by the lowly Core 2 Duo E6600.
Next, we have another relatively new addition to our benchmark suite: a slick little Folding@Home benchmark CD created by notfred, one of the members of Team TR, our excellent Folding team. For the unfamiliar, Folding@Home is a distributed computing project created by folks at Stanford University that investigates how proteins work in the human body, in an attempt to better understand diseases like Parkinson’s, Alzheimer’s, and cystic fibrosis. It’s a great way to use your PC’s spare CPU cycles to help advance medical research. I’d encourage you to visit our distributed computing forum and consider joining our team if you haven’t already joined one.
The Folding@Home project uses a number of highly optimized routines to process different types of work units from Stanford’s research projects. The Gromacs core, for instance, uses SSE on Intel processors, 3DNow! on AMD processors, and Altivec on PowerPCs. Overall, Folding@Home should be a great example of real-world scientific computing.
notfred’s Folding Benchmark CD tests the most common work unit types and estimates performance in terms of the points per day that a CPU could earn for a Folding team member. The CD itself is a bootable ISO. The CD boots into Linux, detects the system’s processors and Ethernet adapters, picks up an IP address, and downloads the latest versions of the Folding execution cores from Stanford. It then processes a sample work unit of each type.
On a system with two CPU cores, for instance, the CD spins off a Tinker WU on core 1 and an Amber WU on core 2. When either of those WUs is finished, the benchmark moves on to additional WU types, always keeping both cores occupied with some sort of calculation. Should the benchmark run out of new WUs to test, it simply processes another WU in order to prevent any of the cores from going idle as the others finish. Once all four of the WU types have been tested, the benchmark averages the points per day among them. That points-per-day average is then multiplied by the number of cores on the CPU in order to estimate the total number of points per day that CPU might achieve.
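For those who prefer to see the arithmetic spelled out, here is a minimal Python sketch of that estimate. It only illustrates the averaging and scaling described above and is not notfred’s actual script; the function name, the placeholder label for the fourth WU type, and the points-per-day figures are all made up for the example.

```python
def projected_points_per_day(ppd_by_wu_type, core_count):
    """Average the per-WU-type points per day, then scale by core count."""
    if not ppd_by_wu_type:
        raise ValueError("need at least one work unit result")
    average_ppd = sum(ppd_by_wu_type.values()) / len(ppd_by_wu_type)
    return average_ppd * core_count


# Hypothetical per-core results for one dual-core CPU (NOT measured data).
# "fourth_type" is a placeholder name for the fourth WU type.
sample_results = {
    "Tinker": 60.0,
    "Amber": 80.0,
    "Gromacs": 300.0,
    "fourth_type": 200.0,
}

# Average is 160 points per day; two cores give a projected 320.
print(projected_points_per_day(sample_results, core_count=2))
```

Note that the multiplication assumes every core can sustain the average rate at the same time, which is part of why the method is only an estimate.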
This may be a somewhat quirky method of estimating overall performance, but my sense is that it generally ought to work. We’ve discussed some potential reservations about how it works here, for those who are interested. I have included results for each of the individual WU types below, so you can see how the different CPUs perform on each.
The Athlons fare significantly better than their Core 2 counterparts with Tinker and Amber work units, but Intel claws its way back into the picture when we switch to Gromacs. Performance per dollar varies, then, and that makes the value proposition of these CPUs for Folding@Home very much contingent on the kinds of work units Stanford will be issuing in the future.
Our scatter plot only considers total projected points per day, which is based on performance across all work unit types. With that metric, AMD’s chips look to be the better options. Things look a little better for Intel on the quad-core front, with the Q6600 just edging out the FX-72 in our performance per dollar chart.
Power consumption and efficiency
Our Extech 380803 power meter has the ability to log data, so we can capture power use over a span of time. The meter reads power use at the wall socket, so it incorporates power use from the entire system: the CPU, motherboard, memory, video card, hard drives, and anything else plugged into the power supply unit. (We plugged the computer monitor and speakers into a separate outlet, though.) We measured how each of our test systems used power during a roughly one-minute period, during which time we executed Cinebench’s multithreaded rendering test. All of the systems had their power management features (such as SpeedStep and Cool’n’Quiet) enabled during these tests.
Complete results are available in our Core 2 Extreme QX6800 review here, but for this article, we’re looking at the amount of energy used by each system to render the scene. This method should account for both power use and, to some degree, performance, because shorter render times may lead to less energy consumption.
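To make the idea concrete, here is a rough Python sketch of how logged power readings can be turned into render energy. It assumes the meter logs one reading per fixed interval (we use one second here purely for illustration) and that the log has already been trimmed to the render itself; the function name and the wattage figures are hypothetical, not our measured data.

```python
def render_energy_joules(power_samples_watts, sample_interval_s=1.0):
    """Approximate energy (joules) as the sum of power readings times the interval.

    Each reading is treated as constant over one logging interval, so
    energy = sum(P_i) * dt, and watts * seconds = joules.
    """
    return sum(power_samples_watts) * sample_interval_s


# Hypothetical logs covering only the render itself (NOT measured data).
fast_but_hungry = [200.0] * 30   # 200 W for 30 seconds
slow_but_frugal = [150.0] * 50   # 150 W for 50 seconds

print(render_energy_joules(fast_but_hungry))  # 6000.0
print(render_energy_joules(slow_but_frugal))  # 7500.0
```

In this made-up case, the system that draws more power still uses less energy overall, because it finishes the render sooner.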
You’ll notice that we’ve not included the Athlon 64 FX-72 here. That’s because our “simulated” FX-72 CPUs are underclocked versions of faster processors, and we’ve not been able to get Cool’n’Quiet power-saving tech to work when CPU multiplier control is in use. We have included our simulated Core 2 Duo E6600 and E6700, because SpeedStep works fine on the D975XBX2 motherboard alongside underclocking. The simulated processors’ voltage may not be exactly the same as what you’d find on many retail E6600s and E6700s. However, voltage and power use can vary from one chip to the next, since Intel sets voltage individually on each chip at the factory.
The 1/microjoules value in our power efficiency per dollar graph is really 1/(watt-seconds/1000000), or 1,000,000 m⁻² kg⁻¹ s². That’s a little obscure, but it quantifies power efficiency in a readable fashion based on the source data, which is in joules. We’re looking at power efficiency per dollar.
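In code form, the calculation might look something like the sketch below. The function names are our own invention for illustration, the division by price is an assumption in keeping with how we compute performance per dollar elsewhere, and the energy and price figures are hypothetical.

```python
def power_efficiency(render_energy_j):
    """Return 1 / (joules / 1,000,000), written as 1,000,000 / joules."""
    if render_energy_j <= 0:
        raise ValueError("render energy must be positive")
    return 1_000_000 / render_energy_j


def power_efficiency_per_dollar(render_energy_j, price_usd):
    """Divide the efficiency figure by the CPU's price (assumed method)."""
    if price_usd <= 0:
        raise ValueError("price must be positive")
    return power_efficiency(render_energy_j) / price_usd


# Hypothetical example: a 5,000-joule render on a $100 CPU (NOT measured data).
print(power_efficiency(5000.0))                    # 200.0
print(power_efficiency_per_dollar(5000.0, 100.0))  # 2.0
```

Lower render energy yields a higher efficiency number, so bigger is better on both axes of the resulting plot.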
Despite having nearly the highest render energy of the lot, the X2 3600+’s bargain basement price keeps it atop the power efficiency per dollar standings. The E4300 claims second place, offering higher power efficiency and a lower price than the 4400+. Our bronze medal winner is the E6400, whose power efficiency is quite substantially above the E6300’s, despite the small pricing gap between the two chips.
That said, we couldn’t get away without mentioning the Core 2 Quad Q6600, which tops the power efficiency scale despite its fairly reasonable price tag. If you do 3D rendering work for Al Gore, this is the chip to get. Still, it’s worth pointing out that the Q6600 doesn’t have the lowest idle power consumption (see our full results here), so the X2 3600+ may yet be the friendliest to your power bill if you don’t run compute-intensive tasks very often.
So far, we’ve quantified performance per dollar and relative value across a wide range of applications. We also wanted to come up with an aggregate score that distilled those results into a single scatter plot for easy reference. This aggregate score is by no means the be-all and end-all of processor value propositions, but it summarizes the results we’ve stepped through on the preceding pages. To generate this value score, we averaged the percentage performances of each CPU against our Athlon 64 X2 3600+ baseline. Each application’s performance was weighted equally. We also left the render energy results out of this calculation, since that measure of energy efficiency is quite different from a benchmark score.
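Here is a small Python sketch of that averaging, for the curious. The function name, the test names, and the numbers are hypothetical, and the handling of time-based results (where lower is better) is an assumption on our part for the sake of the example: the ratio is simply inverted so that 100% always means "equal to the baseline."

```python
def aggregate_score(cpu_results, baseline_results, lower_is_better=()):
    """Equal-weight average of per-test performance as a percentage of baseline."""
    percentages = []
    for test_name, baseline_value in baseline_results.items():
        cpu_value = cpu_results[test_name]
        if test_name in lower_is_better:
            # Time-style result: finishing sooner means higher performance.
            ratio = baseline_value / cpu_value
        else:
            # Rate-style result: a higher number means higher performance.
            ratio = cpu_value / baseline_value
        percentages.append(100.0 * ratio)
    return sum(percentages) / len(percentages)


# Hypothetical results for two tests (NOT measured data).
baseline = {"game_fps": 50.0, "encode_seconds": 120.0}
candidate = {"game_fps": 75.0, "encode_seconds": 96.0}

# 150% in the game and 125% in the encode average out to 137.5%.
print(aggregate_score(candidate, baseline, lower_is_better={"encode_seconds"}))
```

Because every test carries the same weight, one outlier result can pull the aggregate noticeably in either direction.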
It would be unwise to draw too many conclusions based on this aggregate score alone. (You didn’t skip ahead to this page, did you?) This sort of accumulation doesn’t give us a singularly authoritative number to quantify CPU performance, but it does give us a good idea of how chips perform overall across our test suite.
Unsurprisingly, we see the same five chips come out on top as in many of our isolated tests: the 3600+, E4300, 5000+, 5600+, and E6600. We should, however, point out that the 4400+, E6400, and 6000+ sit very close to their respective competitors on our scatter plot, so they’re not necessarily bad choices—they’re just not the best. If you do a lot of 3D rendering and select the 6000+ for your own PC, for instance, you shouldn’t sacrifice much in terms of overall performance compared to the E6600.
Among quad-core processors, the picture is much clearer. The Q6600 is quite obviously the most sensible choice compared to both AMD’s underperforming Quad FX chips and Intel’s overpriced Core 2 Extreme offerings.
After running 16 processors through a lengthy round of testing and analysis, we finally succeeded in unmasking the “final five”—the five chips that offer the best overall performance per dollar across our tests. If you haven’t been paying attention, those are the Athlon 64 X2 3600+, the Core 2 Duo E4300, the Athlon 64 X2 5000+, the Athlon 64 X2 5600+, and the Core 2 Duo E6600.
This discovery shows that repeated price cuts have actually helped AMD’s processor lineup remain competitive on the value front, even in the face of Intel’s own price reductions and the launch of the $113 Core 2 Duo E4300. Whether AMD can keep this up until the rumored November-December release time frame of its Phenom chips remains to be seen, however.
Aside from our final five, Intel’s Core 2 Quad Q6600 receives an honorable mention for being the best overall choice for users more focused on performance than on saving money, though still concerned about both. $530 is a fair amount to spend on a CPU, but the performance divide between the Q6600 and dual-core offerings is large enough to justify that premium for pretty much any performance-conscious user. In the same playing field, AMD’s Quad FX platform fails to impress, and Intel’s quad-core Core 2 Extreme chips are just too expensive. A word of caution, though: a number of applications still don’t benefit from the Q6600’s extra cores at all (games and LAME MP3 encoding come to mind), so this chip isn’t a performance panacea for everyone.
So there you have our first look at the performance per dollar of today’s CPUs. Considering the heated price war currently raging between AMD and Intel, we expect this subject to be one we’ll revisit in the future.