Core Duo on the desktop

Scott Wasson

THE CORE DUO MAY WELL BE Intel’s best microprocessor. Conceived initially as a mobile processor, Core Duo is the dual-core descendant of the Pentium M and the direct predecessor to Intel’s next-generation microarchitecture, versions of which are due to arrive in desktops, servers, workstations, and laptops later this year. You can understand, then, why we have been waiting impatiently, wriggling in our chairs, until somebody—anybody—actually delivered a desktop-style motherboard for this puppy. This high-performance, low-power processor could be an ideal centerpiece for a fast but quiet desktop system or a killer home theater PC. Equipped with desktop-class cooling and overclocked accordingly, the Core Duo might turn into a fire-breathing titan of performance—and it could give us something of a glimpse of Intel’s future, as well.

One of the first Core Duo desktop mobos, the N4L-VM DH from Asus, finally arrived in Damage Labs just over a week ago. Since then, we’ve been busily testing this processor against a broad range of potential competitors, from other mobile-on-desktop options like the Turion 64 and Pentium M to the latest dual-core desktop processors like the Pentium Extreme Edition 965 and the Athlon 64 FX-60. Can the Core Duo really hold its own against today’s fastest desktop processors? What we found may surprise you.

Core Duo close up
If you’ve been hanging around here for a while, you may have heard us referring to Core Duo by its code name, Yonah, long before Intel decided to give it a somewhat confusing official name. We previewed Yonah after last fall’s Intel Developer Forum, explaining some of the features that make it unique. Like other dual-core processors such as the Pentium D and the Athlon 64 X2, the Core Duo joins together a pair of CPU cores on a single chip. In the case of the Core Duo, those CPU cores are massaged and tweaked versions of the Pentium M processor, familiar as part of Intel’s Centrino mobile platform. (The Pentium M itself traces its heritage back through many earlier Pentiums, including the Pentium III and Pentium Pro; we’ve covered the Pentium M at some length in the past.)

Unlike the Pentium D, however, the Core Duo benefits from a very intentional dual-core design. In fact, the Core Duo’s two cores are arguably more tightly integrated than those in AMD’s dual-core Athlon and Opteron processors. Each of the Duo’s cores has a 32KB L1 cache for data and another of the same size for instructions, but the cores share a common 2MB L2 cache via an internal, on-chip bus. Space in the L2 cache is allocated dynamically, so either core can allocate up to the full 2MB of cache for itself, should the other one not need it. The two cores can effectively share the data in the L2 cache, as well. This use of a single, unified L2 cache greatly simplifies the management of cache coherency, especially in a single-socket system like a laptop or desktop PC. With the Pentium D, by contrast, cache coherency updates must constantly be passed across the system’s front-side bus, even in a single-socket system.

Core Duo is more than just a pair of Pentium Ms made to share a cache, though. Intel’s Israel-based CPU design team has modified the Pentium M design in order to address some of its performance shortcomings, especially in terms of multimedia performance. Simply going from a single core to two will inevitably help speed up tasks like video encoding, where the software is typically multithreaded. But Yonah also supports the group of 13 new instructions known as SSE3, handles some SSE2 instructions like Shuffle and Unpack up to 30% faster, and is capable of using its instruction-grouping abilities (known as micro-ops fusion) on some SSE instructions, improving overall throughput. These and other enhancements should help alleviate some of the Pentium M’s few performance weaknesses compared to today’s desktop processors.

Of course, none of these enhancements would matter much if Intel couldn’t fit the Core Duo into laptops with about the same size, weight, battery power, and cooling capabilities as the Pentium M. In order to make that happen, Intel has arrayed a number of technologies in Core Duo’s favor, not least of which is its 65-nanometer chip fabrication process. The process shrink means Core Duo’s 151 million transistors can reside in an area only 90 mm², barely any larger than the single-core Pentium M “Dothan” at 84 mm² when manufactured with Intel’s 90nm process. The Core Duo also benefits from the fortuitous effects of multi-core processor designs on power consumption; by keeping clock frequencies relatively low and doubling up on CPU logic, a dual-core CPU can generally achieve better performance per watt than a single-core CPU, provided that multithreaded software is reasonably abundant.

The Core Duo T2600 processor

These moves to 65nm and dual cores are big steps in the fight to keep power consumption in check, but Intel didn’t stop there. They’ve also given the Core Duo a range of power management techniques that can reduce power use when part or all of the processor is idle. The two cores can independently manage some of their own traditional low-power states or C-states, such as Halt, Stop-Clock, and Sleep, so that one core could enter a lower power mode while the other cranks away on a thread. More innovatively, the Core Duo can choose to deactivate portions of its shared L2 cache in stages if the current applications don’t require full use of the cache. Unneeded parts of the cache are flushed to memory and temporarily shut down. Should the CPU become so idle that the entire L2 cache can be flushed to RAM, the Core Duo will enter what Intel calls an Enhanced Deeper Sleep mode. In this state, without the need to power the L2 cache, the processor can operate at even lower voltages. Such C-state transitions happen in fractions of a second, so entering or recovering from a low-power state should be largely imperceptible to the end user. Core Duo also carries over Intel’s Enhanced SpeedStep dynamic clock speed and voltage scaling feature, of course, and it adds new thermal sensors on each CPU core near likely hotspots on the chip.

Thanks to all of these changes, Intel rates the TDP, or thermal design power, of the first wave of Core Duo chips—ranging from 1.66GHz to 2.166GHz—at only 31W. (TDP is a design target for cooling solutions and doesn’t necessarily represent the peak power draw of the part.) That’s up only slightly from the 27W TDP of the faster variants of the Dothan, like the Pentium M 770 at 2.13GHz. Intel makes a single-core variant of Yonah called the Core Solo, and its TDP is 27W. There are also low-voltage versions of Core Duo with TDPs as low as 15W. All in all, that’s a mighty impressive achievement for a processor with this sort of performance.

The Core Duo is packed to the hilt with the latest features, including SSE3, support for the Execute Disable Bit for better antivirus protection, and Intel’s new VT virtualization technology. The one omission from this list of CPU enhancements is a notable one, though: Core Duo lacks support for EM64T, Intel’s version of the 64-bit extensions to the x86 instruction set pioneered by AMD. Without 64-bit support, the Core Duo can’t easily address more than 4GB of memory, and it loses out on the potential performance gains offered by x86-64’s additional registers. Being stuck at under 4GB won’t matter much for the Core Duo’s success in mobile applications, and it’s not likely to harm its prospects inside of Intel’s Viiv-branded home theater PCs. For really beefy desktops, workstations, and especially servers, though, 64-bit support is becoming much more important with time. We will have to wait for Core Duo’s successor, code-named Merom and based on Intel’s next-generation Core microarchitecture, for 64-bit capabilities in an Intel mobile processor.

Core Duo is part of Intel’s newfangled Centrino platform, code-named Napa. The Core Duo’s main companion in the Napa scheme is the mobile version of Intel’s 945G core logic chipset, the 945GM. The 945GM north bridge chip is linked to Core Duo by means of a 667MHz system bus, and the 945GM’s memory controller supports dual channels of DDR2 memory at up to 667MHz. That gives the chipset twice the effective bandwidth of the front-side bus, a disparity Intel has accepted as normal. In many Centrino-based laptops, the excess bandwidth will be used to feed the 945G’s integrated graphics processor, the GMA 950.
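For the curious, the "twice the bandwidth" figure falls out of some simple peak-rate arithmetic. Here is a rough sketch in Python. It assumes 64-bit (8-byte) data paths for both the front-side bus and each DDR2 channel, which the numbers above imply but don't state outright, and it deals only in theoretical peaks, not measured throughput.

```python
# Peak-bandwidth arithmetic for the Core Duo and 945GM pairing described above.
# Assumption (not stated in the article): the front-side bus and each DDR2
# channel are 64 bits (8 bytes) wide. These are theoretical peaks only.

BYTES_PER_TRANSFER = 8  # assumed 64-bit data path


def peak_bandwidth_gb_s(mega_transfers_per_s: float, channels: int = 1) -> float:
    """Return theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    return mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


fsb = peak_bandwidth_gb_s(667)                 # 667MHz system bus
memory = peak_bandwidth_gb_s(667, channels=2)  # two channels of DDR2 at 667MHz

print(f"Front-side bus:    {fsb:.2f} GB/s")
print(f"Dual-channel DDR2: {memory:.2f} GB/s")
print(f"Memory-to-bus ratio: {memory / fsb:.1f}x")
```

Under those assumptions the bus tops out a bit above 5 GB/s while the memory controller can supply roughly double that, which is the headroom the integrated graphics can draw on.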

In part because it needs this new chipset with support for higher bus speeds, the Core Duo is not a drop-in replacement for the Pentium M. In fact, although it has 479 pins on its belly, the Core Duo uses a different pin layout than the Pentium M, making it physically incompatible with older Socket 479 motherboards.

 

Bringing it to the desktop: Asus’ N4L-VM DH
Despite its new pin layout, the Core Duo’s transition to the desktop has been eased considerably by the success of Pentium M desktop boards and by the inclusion of the Core Duo in a non-mobile role as part of Intel’s Viiv platform for home theater PCs. The Viiv platform requires a dual-core Intel processor, and obviously, Core Duo is a more attractive prospect for system builders looking to produce a quiet PC for the living room than the hotter, power-hungry Pentium D. A number of motherboard makers have announced mobos for the Core Duo, many of them designed to fit into the mini-ATX form factor and bristling with the ports needed for driving home-theater-class audio and video equipment.


The Asus N4L-VM DH motherboard

The Asus N4L-VM DH was the first of these boards to make its way into our labs, and so it captured our undivided attention. This is a micro-ATX-sized board with little of the flair of Asus’s high-end mobos, but what’s important is its ability to support a Core Duo CPU. Asus went all out on the mobile theme with this one, basing the board on a true mobile chipset, the 945GM north bridge and ICH7-M south bridge. It’s like a laptop board with desktop-style DIMM and PCI slots. Some other manufacturers have opted to use desktop chipsets like the 975X in their Core Duo desktop boards, but the mobile chipset ought to draw less power and generate less heat than a desktop version would. As you can see, the north bridge requires only passive cooling, and the ICH7-M needs no heat sink at all.

The mobile chipset is probably to blame for the lack of expansion options, though. The board has only two DIMM slots (although that would matter more if the Core Duo had 64-bit addressing) and only two SATA ports off of the ICH7-M. The other two SATA ports come from an auxiliary SATA controller chip from JMicron, and one of those two ports is an external eSATA job.


Lotta audio, but very little video.

The port selection around back of the N4L-VM DH is a little bit bewildering for a “digital home” type product aimed at home theater PCs. Things start out well. The board has eight-channel analog audio ports, as well as optical and coaxial outputs for digital audio—just what you’d want. It also has that eSATA port for adding external storage, and the Ethernet port is of the Gigabit variety. All of these things make sense. However, the N4L-VM’s only video output is a VGA port—there’s no DVI port for driving an HDTV, no HDMI connector, no component output, and no TV out of any kind. Fortunately, Asus does sell a DVI-out card separately, but these choices are a bit puzzling for a potential HTPC board. Perhaps they simply expected folks to use the PCI-E x16 slot for graphics and only included the VGA port since the graphics capabilities were already built into the chipset. At any rate, my ideal HTPC motherboard would include a better selection of video output ports and an 802.11g wireless networking capability.

You’re not likely to receive your Core Duo processor with a “stock” cooler, and so Asus includes a custom one with the motherboard, to be secured by a pair of metal hoops protruding from either side of the CPU socket. Installing and removing the thing is fairly easy, and the fan is virtually silent when running under normal loads with the board’s temperature-based fan speed control enabled. It does become audible when the CPU is running something intensive like a multi-threaded rendering app, but even then, it’s barely noticeable from a few feet away.

Just for the sake of comparison, here’s a look at Asus’s Core Duo cooler next to the type of cooler we use to keep a Pentium Extreme Edition processor running OK without too much noise: a Zalman CNPS9500 LED.


We are not making this up.

My only complaint about the N4L-VM’s cooling setup is that it doesn’t allow for the mounting of any type of standard desktop cooler, like a Socket 478-compliant one. The ability to use a beefier cooler might lead to some intriguing options, like a totally passive CPU cooling setup or particularly egregious overclocking.

Trouble in paradise?
Obviously, our N4L-VM DH review unit was one of the first of these boards available, and it still had some rough edges. After too many hours of troubleshooting, we were able to pinpoint a problem with the board’s AHCI implementation on its two ICH7-M-connected SATA ports. AHCI is the SATA extension that allows for newer features like device hot-plugging and Native Command Queuing. NCQ can affect performance, so we naturally wanted to have it enabled, but turning on AHCI resulted in a more or less constant 40% CPU utilization in Windows, apparently caused by hardware interrupts. Our only option was to turn off AHCI and test performance without NCQ enabled. Asus confirmed to us that they were able to reproduce the problem, and they say they’re looking into it with Intel.

Another problem we encountered was an annoying tendency for the system to fail to come back after a warm reboot about 50% of the time. We’d have to shut down the system power at the PSU, forcing a cold boot, in order to recover. Each time we went through this process, the board would come up during POST after the cold boot and say that overclocking had failed. We’d then have to press F1, enter the BIOS, and exit in order for the system to continue booting. This is what’s known in the industry as “really frickin’ annoying.” We tried using a couple of different big name brands of 1GB DDR2 DIMMs in the board (Corsair and Crucial) and had the same problem with both. Perhaps dialing back to way more conservative memory timings would have helped, but the 3-4-4-10 timings we used were within the tolerances of the DIMMs, and the system was otherwise stable in MemTest86+ and in every application in our test suite. Let’s hope Asus can resolve this problem, and the AHCI issue, with a future BIOS update.

The last problem we ran into on our way to mobile-on-desktop nirvana was the N4L-VM DH’s incredibly limited set of BIOS options for overclocking. There’s no control over CPU voltage, no option to set the memory frequency, no option to change the CPU multiplier, no means of controlling clock ratios between different components. The board does offer your choice of DDR2 voltages from 1.8V to 2.1V, which is definitely better than nothing.


1996 called. They want their BIOS overclocking menu back.

We tried overclocking the Core Duo, but when the system wouldn’t boot our stock 2.16GHz processor at 2.3GHz, we were left with no real tweaking options to help resolve the problem.

 

Test notes
The Core Duo’s lack of support for 64-bit extensions created a dilemma for us when we set out to test it against the competition. We know that 64-bit support can sometimes lead to faster performance, and we’ve already found 64-bit versions of many of our test suite applications. We didn’t want to deprive the 64-bit chips of their chance to strut their stuff, and we had a big bunch of 64-bit test results already in the can for a range of desktop processors. What to do?

We decided to test the Core Duo and the Pentium M the only way we could: with 32-bit apps in a 32-bit OS. We would then test the Turion 64 and the desktop chips with 64-bit code on a 64-bit OS where possible, but use the Turion 64 ML-44 as our “bridge” to the 32-bit world. We tested the ML-44 in both 32 and 64 bits, in order to demonstrate the differences between the two. Of course, many apps are still 32 bits only, and in the case of WorldBench, it will only run in 32-bit Windows, so we have quite a few scores in the review that use only 32-bit code on all processors, as well.

Notice that we’ve tested a couple of Turion 64 processors with similar nomenclature. The Turion 64 ML-40 runs at 2.2GHz and is rated at 35W TDP. The lower power Turion 64 MT-40 runs at this same clock speed, but it’s rated at only 25W TDP. Performance between the two should be essentially identical, but power consumption should differ. We’ll test to see how much difference there really is.

Finally, please note that the two Pentium D 900-series processors in our test are actually a Pentium Extreme Edition 955 chip that’s been set to the appropriate core and bus speeds and had Hyper-Threading disabled in order to simulate the actual products. Performance should be identical to the real McCoys.

Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and the results were averaged.

Our test systems were configured like so:

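Intel D975XBX, 800MHz bus
Processors: Pentium Extreme Edition 840 3.2GHz; Pentium D 930 3.0GHz; Pentium D 950 3.4GHz
System bus: 800MHz (200MHz quad-pumped)
Motherboard: Intel D975XBX
Board revision: BQBX4500151
BIOS revision: BX97510J.86A.0354.2005.1208.1112
North bridge: 975X MCH
South bridge: ICH7R
Chipset drivers: INF Update 7.2.2.1006; Intel Matrix Storage Manager 5.5.0.1035
Memory size: 2GB (2 DIMMs)
Memory type: Crucial Ballistix PC2-8000 DDR2 SDRAM at 800MHz
CAS latency (CL): 4
RAS to CAS delay (tRCD): 4
RAS precharge (tRP): 4
Cycle time (tRAS): 15
Audio: Integrated ICH7R/STAC9221D5 with SigmaTel 5.10.4825.0 drivers
OS: Windows XP Professional x64 Edition; Windows XP Professional with Service Pack 2 (WorldBench only)

Intel D975XBX, 1066MHz bus
Processors: Pentium 4 Extreme Edition 3.73GHz; Pentium Extreme Edition 955 3.46GHz; Pentium Extreme Edition 965 3.73GHz
System bus: 1066MHz (266MHz quad-pumped)
Motherboard: Intel D975XBX
Board revision: BQBX54500151 (BQBX55201124 for the Pentium Extreme Edition 965)
BIOS revision: BX97510J.86A.0354.2005.1208.1112 (BX97510J.86A.0669.2006.0301.1046 for the Pentium Extreme Edition 965)
North bridge: 975X MCH
South bridge: ICH7R
Chipset drivers: INF Update 7.2.2.1006; Intel Matrix Storage Manager 5.5.0.1035
Memory size: 2GB (2 DIMMs)
Memory type: Crucial Ballistix PC2-8000 DDR2 SDRAM at 800MHz
CAS latency (CL): 4
RAS to CAS delay (tRCD): 4
RAS precharge (tRP): 4
Cycle time (tRAS): 15
Audio: Integrated ICH7R/STAC9221D5 with SigmaTel 5.10.4825.0 drivers
OS: Windows XP Professional x64 Edition; Windows XP Professional with Service Pack 2 (WorldBench only)

Asus A8N32-SLI Deluxe
Processors: Athlon 64 X2 3800+ 2.0GHz; Athlon 64 X2 4800+ 2.4GHz; Athlon 64 FX-57 2.8GHz; Athlon 64 FX-60 2.6GHz; Opteron 165 1.8GHz; Opteron 180 2.4GHz
System bus: 1GHz HyperTransport
Motherboard: Asus A8N32-SLI Deluxe
Board revision: 1.01
BIOS revision: 0806
North bridge: nForce4 SLI X16
South bridge: nForce4 SLI
Chipset drivers: SMBus driver 4.5; IDE/SATA driver 5.52
Memory size: 2GB (2 DIMMs)
Memory type: Crucial PC3200 DDR SDRAM at 400MHz
CAS latency (CL): 2.5
RAS to CAS delay (tRCD): 3
RAS precharge (tRP): 3
Cycle time (tRAS): 8
Audio: Integrated nForce4/ALC850 with Realtek 5.10.0.5970 drivers
OS: Windows XP Professional x64 Edition; Windows XP Professional with Service Pack 2 (WorldBench only)

MSI RS482M-IL
Processors: Turion 64 ML-44 2.4GHz; Turion 64 ML-40 2.2GHz; Turion 64 MT-40 2.2GHz
System bus: 800MHz HyperTransport
Motherboard: MSI RS482M-IL
Board revision: 2.1
BIOS revision: 080012
North bridge: RS482
South bridge: SB400
Chipset drivers: SMBus driver 5.10.1000.5
Memory size: 2GB (2 DIMMs)
Memory type: Crucial PC3200 DDR SDRAM at 400MHz
CAS latency (CL): 2.5
RAS to CAS delay (tRCD): 4
RAS precharge (tRP): 4
Cycle time (tRAS): 12
Audio: Integrated SB400/ALC655 with Realtek 5.10.0.5970 drivers
OS: Windows XP Professional x64 Edition; Windows XP Professional with Service Pack 2 (WorldBench only)

Asus N4L-VM DH
Processor: Core Duo T2600 2.166GHz
System bus: 667MHz (166MHz quad-pumped)
Motherboard: Asus N4L-VM DH
Board revision: 1.02G
BIOS revision: 0303
North bridge: 945GM MCH
South bridge: ICH7-M
Chipset drivers: INF Update 7.2.2.1006; Intel Matrix Storage Manager 5.5.0.1035
Memory size: 2GB (2 DIMMs)
Memory type: Crucial Ballistix PC2-8000 DDR2 SDRAM at 667MHz
CAS latency (CL): 3
RAS to CAS delay (tRCD): 4
RAS precharge (tRP): 4
Cycle time (tRAS): 10
Audio: Integrated ICH7-M/ALC882M with Realtek 5.10.00.5188 drivers
OS: Windows XP Professional with Service Pack 2

AOpen i915Ga-HFS
Processor: Pentium M 760 2.0GHz
System bus: 533MHz (133MHz quad-pumped)
Motherboard: AOpen i915Ga-HFS
Board revision: 48.8EAI1.02A
BIOS revision: 10/18/2005
North bridge: 915G MCH
South bridge: ICH6
Chipset drivers: INF Update 7.2.2.1006
Memory size: 2GB (2 DIMMs)
Memory type: Crucial Ballistix PC2-8000 DDR2 SDRAM at 533MHz
CAS latency (CL): 3
RAS to CAS delay (tRCD): 3
RAS precharge (tRP): 3
Cycle time (tRAS): 10
Audio: Integrated ICH6/ALC880 with Realtek 5.10.00.5188 drivers
OS: Windows XP Professional with Service Pack 2

All systems
Hard drive: Maxtor DiamondMax 10 250GB SATA 150
Graphics: GeForce 7800 GTX 512 PCI-E with ForceWare 81.98 drivers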

Thanks to Crucial for providing us with memory for our testing. Their products and support are both far and away superior to generic, no-name memory.

Also, all of our test systems were powered by OCZ PowerStream 520W power supply units. The PowerStream was one of our Editor’s Choice winners in our latest PSU round-up.

The test systems’ Windows desktops were set at 1152×864 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.

We used the following 32-bit test applications for both 32-bit and 64-bit systems:

For 64-bit systems, we used the following test app versions:

For 32-bit systems, we used these:

The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

 

Memory performance
In the results below and throughout much of the article, I’ve highlighted the Core Duo in bright colors and the rest of the mobile processors in slightly duller colors, so they’re easier to pick out of the crowd.

With their slower bus speeds and more conservative memory subsystems, the mobile processors have less peak memory throughput than their desktop counterparts. However, the Core Duo is at the top of the bunch of mobile processors, limited mainly by its 667MHz front-side bus.

Our version of Linpack isn’t highly optimized for scientific computing, but it can give us a nice look at the “shape” of the L1 and L2 caches on these processors. The Core Duo’s combination of L2 cache bandwidth, cache size, and floating-point math prowess leads to the highest sustained throughput of any of these CPUs. Also, Core Duo’s cache sharing is at work. Although our single-threaded Linpack test runs on just one CPU core, the Core Duo’s effective L2 cache size shows up here as 2MB; performance doesn’t drop off at larger matrix sizes as it does on the Athlon 64 FX-60, for instance, which has a 1MB L2 cache per core.
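To get a feel for where those drop-offs should land, here is a back-of-the-envelope sketch in Python. It assumes the test works on a square matrix of 8-byte double-precision values and that the matrix dominates the working set. Those are our assumptions for illustration, not details of the Linpack build we used.

```python
# Working-set sketch for a Linpack-style test.
# Assumptions (not from the article): the matrix is square, stored as
# 8-byte double-precision values, and dominates the working set.

BYTES_PER_DOUBLE = 8
KB = 1024


def matrix_kb(n: int) -> float:
    """Footprint in KB of an n x n matrix of doubles."""
    return n * n * BYTES_PER_DOUBLE / KB


def largest_fitting_order(cache_kb: int) -> int:
    """Largest n whose n x n matrix still fits within cache_kb."""
    n = 0
    while matrix_kb(n + 1) <= cache_kb:
        n += 1
    return n


for label, cache_kb in (("1MB L2 per core", 1024), ("2MB shared L2", 2048)):
    print(f"{label}: matrices up to about {largest_fitting_order(cache_kb)} "
          f"x {largest_fitting_order(cache_kb)} fit in cache")
```

Under those assumptions, a single thread with the full 2MB to itself can hold a noticeably larger matrix in cache than one confined to a 1MB slice, which matches the later drop-off we see on the Core Duo.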

Despite its fast L2 cache and relatively high memory bandwidth for a mobile CPU, the Core Duo’s memory access latencies are the highest of the group. The Turion 64, with a single channel of DDR400 memory hanging off of its on-chip memory controller, has access latencies that are half those of the Core Duo. The results of these synthetic tests don’t directly dictate real-world performance, though, and Intel has taken steps to mask memory access latencies in the Duo.

 

F.E.A.R.
We tested F.E.A.R. by manually playing through a specific point in the game five times for each CPU while recording frame rates using the FRAPS utility. Each gameplay sequence lasted 60 seconds. This method has the advantage of simulating real gameplay quite closely, but it comes at the expense of precise repeatability. We believe five sample sessions are sufficient to get reasonably consistent and trustworthy results. In addition to average frame rates, we’ve included the low frame rates, because those tend to reflect the user experience in performance-critical situations. In order to diminish the effect of outliers, we’ve reported the median of the five low frame rates we encountered.
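The reduction described above is easy to sketch in Python. The session numbers below are invented purely for illustration, and taking the mean of the five session averages is our reading of "average frame rates." The point is to show why the median of the lows is less jumpy than their mean when one session has a hiccup.

```python
from statistics import mean, median

# Hypothetical results from five 60-second sessions on one CPU:
# (average fps, lowest fps). These numbers are made up for illustration.
sessions = [(61.2, 34), (59.8, 31), (62.5, 36), (60.4, 19), (61.0, 33)]


def summarize(sessions):
    """Return (mean of session averages, median of session lows)."""
    averages = [avg for avg, _ in sessions]
    lows = [low for _, low in sessions]
    return mean(averages), median(lows)


avg_fps, low_fps = summarize(sessions)
mean_of_lows = mean(low for _, low in sessions)

print(f"Average frame rate: {avg_fps:.1f} fps")
print(f"Median low:         {low_fps} fps")
print(f"Mean low (unused):  {mean_of_lows:.1f} fps")
```

In this made-up data set, the one session that dipped to 19 fps drags the mean of the lows down, while the median stays put.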

We played F.E.A.R. with both CPU and graphics performance options set to the game’s built-in “High” settings.

Above the following benchmark graph, and throughout most of the tests in this review, we’ve included a Task Manager plot showing CPU utilization. These plots were captured on the Pentium Extreme Edition 955 system, and they should offer some indication of how much impact multithreading has on the operation of each application. Single-threaded apps may sometimes show up as spread across multiple processors in Task Manager, but the total amount of space below all four lines shouldn’t equal more than the total area of one square if the test is truly single-threaded. Anything significantly more than that is probably an indication of some multithreaded component in the execution of the test. Because WorldBench’s tests are entirely scripted, however, we weren’t able to capture Task Manager plots for them.
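That rule of thumb can be written down as a quick check. The Python sketch below is only an illustration: the sample traces are invented, and the 15% slack we allow before calling something multithreaded is an arbitrary threshold of our own choosing.

```python
# Sketch of the "area under the Task Manager lines" rule of thumb.
# Each inner list is a hypothetical utilization trace (0-100%) for one
# logical CPU. The slack value is an arbitrary assumption.


def busy_cpu_equivalents(per_cpu_utilization):
    """Sum of average utilization across CPUs; 1.0 means one CPU fully busy."""
    total = 0.0
    for samples in per_cpu_utilization:
        total += sum(samples) / len(samples) / 100
    return total


def looks_multithreaded(per_cpu_utilization, slack=0.15):
    """True if total load is significantly more than one CPU's worth."""
    return busy_cpu_equivalents(per_cpu_utilization) > 1.0 + slack


# One thread bouncing between four logical CPUs: the areas sum to one square.
migrating = [[30, 20, 25, 25], [25, 30, 20, 25], [20, 25, 30, 25], [25, 25, 25, 25]]
# Two busy threads plus a little background noise.
threaded = [[90, 95], [85, 90], [10, 5], [5, 10]]

print(looks_multithreaded(migrating))
print(looks_multithreaded(threaded))
```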

The Core Duo looks really good here, but don’t jump until you’ve seen all of the results. Notice how much quicker the Turion 64 ML-44 is when running 32-bit Windows instead of 64-bit Windows. F.E.A.R. somehow doesn’t run as well on the 64-bit OS, which may be giving the Core Duo a bit of an unfair advantage. This is unusual behavior—most apps don’t run much better or worse in 64 bits—but we should note it.

Battlefield 2
We used FRAPS to capture BF2 frame rates just as we did with F.E.A.R. Graphics quality options were set to BF2’s canned “High” quality profile. This game has a built-in cap at 100 frames per second, and we intentionally left that cap enabled so we could offer a faithful look at real-world performance.

Here the tables are turned, and the Turion 64 ML-44 performs slightly better in the 64-bit OS. In spite of its possible handicap, the Core Duo again looks pretty strong, finishing just ahead of Intel’s fastest desktop processor, the Pentium Extreme Edition 965.

Unreal Tournament 2004
We used a more traditional recorded timedemo for testing UT2004, but we tried out two versions of the game, the original 32-bit flavor and the 64-bit version.

Here again, the Core Duo comes out ahead of the Extreme Edition 965 and the X2 3800+. The Core Duo keeps pace with the Turion 64 ML-44, thus delivering more performance per clock than the 2.4GHz Turion.
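The per-clock claim is simple arithmetic. In the Python sketch below, the score of 100 is a placeholder standing in for "the two chips finish roughly even" and is not a measured result. Only the clock speeds come from our test setup.

```python
# Performance-per-clock arithmetic. The scores are placeholders that
# assume the two chips finish dead even; they are not measured results.

def per_clock(score: float, clock_ghz: float) -> float:
    """Benchmark score delivered per GHz of clock speed."""
    return score / clock_ghz


core_duo = per_clock(score=100.0, clock_ghz=2.166)  # Core Duo T2600
turion = per_clock(score=100.0, clock_ghz=2.4)      # Turion 64 ML-44

advantage = core_duo / turion
print(f"Core Duo per-clock advantage: {(advantage - 1) * 100:.1f}%")
```

With equal scores, the advantage is just the ratio of the clock speeds, or a little under 11% in the Core Duo's favor.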

 

3DMark05

New graphics drivers with optimizations for multi-core processors have allowed the dual-core AMD processors to push ahead of their single-core counterparts here, and so it is for the Core Duo. The T2600 scores higher than any Turion, surpasses the Athlon 64 X2 3800+, and positively womps the Pentium Extreme Edition 965.

As the Task Manager plots show, 3DMark05’s CPU tests are multithreaded, using the CPU to perform vertex processing work. The Pentium Extreme Edition processors get a chance, for once, to take the lead here. The Core Duo easily outclasses the other mobile chips, though, and edges out the Athlon 64 FX-57 in the composite CPU test.

 

WorldBench overall performance
WorldBench uses scripting to step through a series of tasks in common Windows applications and then produces an overall score for comparison. More impressively, WorldBench spits out individual results for its component application tests, allowing us to compare performance in each. We’ll look at the overall score, and then we’ll show individual application results alongside the results from some of our own application tests.

The Core Duo again proves to be Intel’s fastest processor in the overall WorldBench score. Impressively, the T2600 finishes well ahead of any other mobile CPU, as well, and only five points behind the Athlon 64 FX-57.

Image processing

Adobe Photoshop

The Core Duo is about as good as they come for running Photoshop, which should please the Macophiles out there immensely. Only the fastest Athlon 64s available can beat it.

ACDSee PowerPack

The T2600 ends up mid-pack in the ACDSee test, but it’s again ahead of any other mobile processor.

 

Audio editing and encoding

LAME MP3 encoding
LAME MT is, as you might have guessed, a multithreaded version of the LAME MP3 encoder. LAME MT was created as a demonstration of the benefits of multithreading specifically on a Hyper-Threaded CPU like the Pentium 4. You can even download a paper (in Word format) describing the programming effort.

Rather than run multiple parallel threads, LAME MT runs the MP3 encoder’s psycho-acoustic analysis function on a separate thread from the rest of the encoder using simple linear pipelining. That is, the psycho-acoustic analysis happens one frame ahead of everything else, and its results are buffered for later use by the second thread. The author notes, “In general, this approach is highly recommended, for it is exponentially harder to debug a parallel application than a linear one.”
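For readers who want to see the shape of that approach, here is a minimal Python sketch of a two-stage linear pipeline. It is not LAME MT's code: the analyze and encode functions are hypothetical stand-ins, and the one-slot buffer is our interpretation of "one frame ahead." Note that threads in CPython won't actually speed up CPU-bound work like this, so treat it as an illustration of the structure only.

```python
import queue
import threading


def analyze(frame):
    """Stand-in for psycho-acoustic analysis (hypothetical placeholder)."""
    return sum(frame) / len(frame)


def encode(frame, analysis):
    """Stand-in for the rest of the encoder (hypothetical placeholder)."""
    return (len(frame), analysis)


def pipelined_encode(frames, lookahead=1):
    """Run analysis on its own thread, buffered ahead of the encoder."""
    # Bounded buffer: the analysis thread blocks once it gets
    # `lookahead` finished results ahead of the encoding thread.
    results = queue.Queue(maxsize=lookahead)

    def analysis_worker():
        for frame in frames:
            results.put(analyze(frame))

    worker = threading.Thread(target=analysis_worker)
    worker.start()
    # A single producer and a single consumer on a FIFO queue keep
    # each analysis result paired with the right frame.
    encoded = [encode(frame, results.get()) for frame in frames]
    worker.join()
    return encoded


frames = [[1, 2, 3], [4, 5, 6], [7, 8]]
print(pipelined_encode(frames))
```

Because the stages hand off work in a fixed order, the output is the same as it would be from a single-threaded loop, which is a big part of why this style is easier to debug.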

We have results for different versions of LAME MT from different compilers, one from Microsoft and one from Intel, doing two different types of encoding, variable bit rate and constant bit rate. We also have 32-bit and 64-bit executables from each compiler. We are encoding a massive 10-minute, 6-second 101MB WAV file here, as we have done in many of our previous CPU reviews.

Yikes. With a multithreaded version of LAME, the Core Duo encodes audio faster than any other PC processor. Only the thousand-dollar Athlon 64 FX-60 comes close.

MusicMatch Jukebox

Even when it’s weak, it’s not bad. Here the Core Duo finishes mid-pack, a little behind the Turion 64 ML-44.

 

Video editing and encoding

Windows Media Encoder x64 Edition Advanced Profile
We asked Windows Media Encoder to convert a gorgeous 1080-line WMV HD video clip into a 320×240 streaming format using the Windows Media Video 8 Advanced Profile codec.

Things aren’t quite what they seem at first here. The 64-bit version of Windows Media Encoder appears to be rather slow, judging by the Turion 64 ML-44’s much better performance in 32-bit Windows. The Core Duo is still very quick, but I suspect its potency is somewhat exaggerated in this field by the relative slowness of the 64-bit version of WME.

Windows Media Encoder

Here’s another WME test that uses all 32-bit executables, and the Core Duo looks strong, but drops behind the very fastest desktop processors from AMD and Intel.

Adobe Premiere

In Premiere, the Duo finishes over 50 seconds ahead of any Intel desktop processor.

VideoWave Movie Creator

The Duo drops back a few spots in VideoWave, but it’s still well ahead of the mobile competition.

 

Multitasking and office applications

MS Office

Mozilla

Mozilla and Windows Media Encoder

The Office and Mozilla/WME tests are WorldBench’s two tests that involve true simultaneous multitasking. The Duo struggles in the Office test, although the margin of difference between the various competitors is fairly small. The Duo T2600 matches up nicely in the Mozilla web browsing test, and it performs quite well in the Mozilla/WME multitasking test.

 

Other applications

Sphinx speech recognition
Ricky Houghton first brought us the Sphinx benchmark through his association with speech recognition efforts at Carnegie Mellon University. Sphinx is a high-quality speech recognition routine. We use two different versions, built with two different compilers, in an attempt to ensure we’re getting the best possible performance.

This is a wholly single-threaded app, so performance is dictated by what a single core can do. This test has historically favored processors with fast memory subsystems and large L2 caches with aggressive prefetch algorithms. The Core Duo has those traits, and it basically matches the Athlon 64 FX-60.

WinZip

Nero

In WinZip, the Core Duo again outruns its Intel siblings and nearly catches the fastest AMD desktop chips. Nero is more bound by disk controller performance, and the N4L-VM DH’s inability to work properly with Native Command Queuing may be hurting the Duo’s performance here.

 

Cinebench 2003
Cinebench measures performance in Maxon’s Cinema 4D modeling and rendering app. We have both 32- and 64-bit versions of this one.

Cinebench makes good use of 64-bit extensions to achieve shorter render times, but the Core Duo can’t take advantage of that.

The rest of Cinebench’s tests are graphics shading exercises, largely single-threaded, and the Core Duo can’t quite keep up with the Turions in the software-based ones. Once the OpenGL hardware acceleration kicks in, though, the T2600 produces a record score, as all of the 32-bit-only contenders end up near the top of the pack.

 

POV-Ray rendering
POV-Ray has both 32- and 64-bit binaries, and thanks to the nifty SMPOV distributed rendering utility, we’ve been able to make it multithreaded, as well. SMPOV spins off any number of instances of the POV-Ray renderer, and it will divvy up the scene in several different ways. For this scene, the best choice was to divide the screen horizontally between the different threads, which provides a fairly even workload.
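For the curious, here is a minimal sketch of that kind of split, assuming "divide the screen horizontally" means handing each renderer instance a contiguous band of rows. The function name `split_rows` is ours, purely for illustration; we don't know how SMPOV does this internally.

```python
def split_rows(height, workers):
    """Split `height` image rows into `workers` contiguous bands.

    Returns a list of (start_row, end_row) pairs, half-open, that
    together cover rows 0..height with no gaps or overlap. Any
    leftover rows are spread over the first few bands.
    """
    base, extra = divmod(height, workers)
    bands = []
    start = 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        bands.append((start, start + size))
        start += size
    return bands


# Hypothetical example: a 480-row image split between two instances.
bands = split_rows(480, 2)
```

Equal-sized bands only give an even workload when the scene's complexity is spread fairly evenly from top to bottom, which is presumably why the best split differs from scene to scene.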

We considered using the new beta of POV-Ray with native support for SMP, but it proved to be very, very slow. We’ll have to try it again once development has progressed further.

Two cores is a nearly sure recipe for success in something as parallelizable as rendering. The T2600 finishes neck and neck with the Pentium Extreme Edition 955, well ahead of any other mobile chip and also ahead of the fastest single-core desktop processors.

3ds max 7 rendering
We tested 3ds max performance by rendering 20 frames of a sample scene at 320×240 resolution. This particular scene makes use of a motion-blur effect that requires extensive multi-pass rendering. We tried two different renderers: 3ds max’s default scanline renderer and its built-in version of the mental ray renderer.

The Core Duo excels in 3ds max rendering, as well, surpassing the Extreme Edition 965 with both renderers.

 

SiSoft Sandra
Next up is SiSoft’s Sandra system diagnosis program, which includes a number of different benchmarks. The one of interest to us is the “multimedia” benchmark, intended to show off the benefits of “multimedia” extensions like MMX and SSE/2. According to SiSoft’s FAQ, the benchmark actually does a fractal computation:

This benchmark generates a picture (640×480) of the well-known Mandelbrot fractal, using 255 iterations for each data pixel, in 32 colours. It is a real-life benchmark rather than a synthetic benchmark, designed to show the improvements MMX/Enhanced, 3DNow!/Enhanced, SSE(2) bring to such an algorithm.

The benchmark is multi-threaded for up to 64 CPUs maximum on SMP systems. This works by interlacing, i.e. each thread computes the next column not being worked on by other threads. Sandra creates as many threads as there are CPUs in the system and assigns [sic] each thread to a different CPU.

The “Integer x16” version of this test uses integer numbers to simulate floating-point math. The floating-point version of the benchmark takes advantage of SSE2 to process up to eight Mandelbrot iterations at once.
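To make the interlacing scheme concrete, here is a rough sketch in Python of a column-interlaced Mandelbrot render. It is not Sandra's code: the function names, the region of the complex plane, and the fixed-stride column assignment are all our assumptions, and it uses plain scalar floating-point math rather than MMX or SSE2. Python threads also won't run this in parallel on multiple cores, so the sketch only shows how the work is divided, not the speedup.

```python
import threading


def mandelbrot_iters(cx, cy, max_iter=255):
    """Count iterations of z = z*z + c before |z| exceeds 2."""
    zx = zy = 0.0
    for i in range(max_iter):
        if zx * zx + zy * zy > 4.0:
            return i
        zx, zy = zx * zx - zy * zy + cx, 2.0 * zx * zy + cy
    return max_iter


def render_interlaced(width, height, n_threads, max_iter=255):
    """Render iteration counts, with columns interlaced across threads.

    Thread t handles columns t, t + n_threads, t + 2*n_threads, ...
    (a fixed stride; an assumption about what "interlacing" means).
    """
    image = [[0] * width for _ in range(height)]

    def worker(tid):
        for x in range(tid, width, n_threads):
            cx = -2.0 + 3.0 * x / width       # assumed view: real axis -2..1
            for y in range(height):
                cy = -1.0 + 2.0 * y / height  # assumed view: imaginary axis -1..1
                image[y][x] = mandelbrot_iters(cx, cy, max_iter)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return image
```

Interlacing columns, rather than giving each thread one big contiguous block, tends to spread the expensive pixels near the fractal's boundary more evenly between threads.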

The Duo cranks through the integer test with ease, finishing just ahead of the FX-60. When it comes to the floating-point version of this test, though, the Duo comes back to earth a bit, practically tying the Athlon 64 X2 3800+—still not bad company to keep.

 

Power consumption
We measured the power consumption of our entire test systems, except for the monitor, at the wall outlet using a Watts Up PRO watt meter. The test rigs were all equipped with OCZ PowerStream 520W power supply units. The idle results were measured at the Windows desktop, and we used SMPOV and the POV-Ray renderer to load up the CPUs. In all cases, we asked SMPOV to use the same number of threads as there were CPU front ends in Task Manager—so four for the Pentium XE 965, two for the Core Duo, and so on.

The graphs below have results for “power management” and “no power management.” That deserves some explanation. By “power management,” we mean SpeedStep, PowerNow, or Cool’n’Quiet. In the cases of the Pentium XE 840 and the Pentium XE 965, the C1E halt state is always active, even in the “no power management” tests. The Extreme Edition 955 and the P4 Extreme Edition 3.73GHz don’t support the C1E halt state or SpeedStep. We have omitted the Pentium D 930 and 950 processors here because we don’t have actual samples of these individual chips; our “simulated” versions with an underclocked Extreme Edition 955 are fine for performance testing, but not for power consumption.

The MSI motherboard we used to test the Turion 64 chips didn’t set the voltage correctly, so we had to improvise. To ensure that these tests were accurate, we manually set the voltage supplied to the processors using the RMClock utility. The ML series Turions were set at a peak voltage of 1.35V, while the MT-40 was set at 1.2V. All of the Turions were set to 0.9V when throttled down to their lowest possible clock speeds via PowerNow.

The Core Duo’s deep sleep states help it to hit the lowest idle power use of any of our systems when SpeedStep is enabled. Once the load is cranked up, the Duo T2600 produces jaw-dropping results, with the entire system drawing only 125W while rendering in POV-Ray. And that’s while using a honkin’ GeForce 7800 GTX graphics card that’s contributing significantly to all of these systems’ power draw, even if it is largely idle during our tests.

Under load, the Core Duo is running two threads here and producing an entire frame in roughly half the time of the Turion 64 or the Pentium M, yet its power consumption is nearly the same as the other mobile systems’. That works out to about twice the performance per watt from the Core Duo as from the Pentium M 760 or the Turion 64. Incredible. Intel has pulled off the same sort of feat that AMD did when going from the Athlon 64 to the X2 (witness the FX-57 and X2 4800+ numbers above), but Intel has done it inside of a much smaller power envelope.
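The arithmetic behind that claim is simple enough to sketch. The render times below are hypothetical placeholders, not our measured results, and we assume both systems draw exactly the 125W we saw from the Core Duo system, where the real numbers are only "nearly the same." Performance here is frames per second, so performance per watt is the inverse of the energy spent per frame.

```python
def perf_per_watt(render_seconds, system_watts):
    """Frames per second per watt for a one-frame render.

    Equivalent to 1 / (energy per frame in joules).
    """
    return 1.0 / (render_seconds * system_watts)


# Hypothetical render times; 125W is the Core Duo system's draw under load,
# and the single-core system is assumed (not measured) to match it.
dual_core = perf_per_watt(render_seconds=100.0, system_watts=125.0)
single_core = perf_per_watt(render_seconds=200.0, system_watts=125.0)

ratio = dual_core / single_core
```

Halve the render time at equal power draw and the energy per frame halves too, which is the same thing as doubling performance per watt.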

For the record, notice that the system based on the Turion 64 MT-40 chip draws about 10W less than the system based on the ML-40. That’s the same 10W delta AMD has assigned between those two CPUs’ TDP ratings, although I’d have expected the difference to be exaggerated at the wall socket by power supply inefficiencies and the like. At any rate, the Turion 64 MT-40 gives the Pentium M 760 one heck of a run for its money in terms of overall performance and performance per watt, probably even more so than the ML-44 did in our recent comparison.
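Here is the back-of-the-envelope reasoning behind that expectation, as a small sketch. The 80% power supply efficiency is an assumed round number, not something we measured for the OCZ units, and the sketch ignores losses in the motherboard's own voltage regulators.

```python
def wall_delta(dc_delta_watts, psu_efficiency):
    """Extra power drawn at the wall for a given extra DC load.

    Assumes the power supply's efficiency stays constant over the
    range in question, which is a simplification.
    """
    return dc_delta_watts / psu_efficiency


# A 10W difference at the CPU, through an assumed 80%-efficient supply.
expected = wall_delta(10.0, 0.80)
```

With those assumptions, a 10W difference at the CPU would show up as about 12.5W at the wall socket, so a measured gap of about 10W suggests the two chips' actual draw differs by somewhat less than their TDP ratings do.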

 
Conclusions
Based on what we’ve seen, one can’t help but conclude the Core Duo’s performance per watt is unmatched in the world of PC processors. The Core Duo is obviously the best mobile CPU on the market, more than doubling the peak performance of the Pentium M while operating in the same power envelope. What’s more shocking is the fact that the Core Duo T2600’s outright performance is easily superior to Intel’s supposed flagship desktop processor, the Pentium Extreme Edition 965. Given its performance, the Core Duo is clearly well-suited for desktop use, where performance is king but quiet computing is still a blessing. Not only that, but at under $700, the Core Duo T2600 costs less than the Extreme Edition 965. The lesser models are more affordable and a better value; the T2400 at 1.83GHz, for example, might tempt us away from one of our favorites in that price range, the Athlon 64 X2 3800+. Even the Asus N4L-VM DH’s rough edges are blunted somewhat by the fact that this board isn’t your typical mobile-on-desktop prima donna; its suggested retail price is only $159.

This combination makes the Core Duo Intel’s most attractive processor for PC enthusiasts, and that proposition could become downright irresistible if Asus or somebody else can deliver a mobo and BIOS with the kind of tweaking options PC enthusiasts have come to expect. The T2600 can’t quite take the overall performance crown from the likes of the Athlon 64 FX-60 or the X2 4800+, but jeez, it’s startlingly close. If we could get the Core Duo overclocked reasonably well, it might just be able to make a run at the title of the fastest x86-compatible CPU—or at least grab a share of that title.

As it stands, the Core Duo is an excellent choice for a quiet desktop PC or a silent gaming rig, and it’s perfect for a home theater PC, where the 64-bit memory space issue isn’t likely to rear its ugly head for at least several years. Were it not for the fact that Core Duo can’t handle 64-bit addressing, I’d say Intel should transition its desktop and server product lines to this microarchitecture right now rather than waiting for Conroe, Merom, and Woodcrest.

Macophiles have to be reading these words with a certain glee, given that Apple has already transitioned several of its products to Core Duo, including the iMac. They should be pleased with the performance and power efficiency of Apple’s new chosen engine, or at least they should once universal binaries are widely available. They’ve gotta be thinking that the severe case of whiplash from the “Intel sucks”-“Intel rules” about-face was worth it. The Pentium Extreme Edition scores in this review even give them plausible cover for the dissonance. I’m happy for them.
