Pentium 4 Northwood
The chip code-named Northwood is Intel’s second incarnation of the Pentium 4 processor. The Pentium 4 “Northwood” isn’t fundamentally different from the original Pentium 4 “Willamette,” but there are a couple of significant changes to the chip.
First, Intel has changed the manufacturing process used to fabricate the chip. The first Pentium 4 chips were manufactured using Intel’s 0.18-micron fab process, which used conventional aluminum for the chip’s interconnects. Northwood is made on Intel’s new 0.13-micron process, which features copper interconnects with a low-k dielectric material that reduces crosstalk. Intel also claims its 60-nanometer transistors are the world’s smallest and fastest in volume production. The Pentium III made the conversion to this new manufacturing process a number of months ago, and the Pentium 4 is just now making the move.
This so-called die shrink does several things for the Pentium 4. Northwood is smaller, runs cooler, and requires less power than Willamette. The Pentium 4’s die size shrinks from 217 square millimeters to 145 square millimeters. Because Intel can fit more chips on a wafer, Northwood should be cheaper to manufacture. The process shrink should also enable Northwood to run at even higher clock frequencies with ease.
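To put the die-size numbers in perspective, here is a rough sketch of how many candidate dies fit on a wafer before and after the shrink. It uses a common first-order approximation (wafer area divided by die area, minus an edge-loss term). The 200mm wafer diameter is my assumption, since Intel’s wafer size isn’t stated here, and the estimate ignores defects and yield entirely.

```python
import math

def gross_dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """First-order estimate: wafer area / die area, minus dies lost at the edge."""
    radius = wafer_diameter_mm / 2
    whole_wafer = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(whole_wafer - edge_loss)

WAFER_MM = 200  # assumption: wafer diameter is not given in the article

willamette_dies = gross_dies_per_wafer(WAFER_MM, 217)  # 217 mm^2 die
northwood_dies = gross_dies_per_wafer(WAFER_MM, 145)   # 145 mm^2 die

print(f"Northwood die area is {145 / 217:.0%} of Willamette's")
print(f"Willamette: roughly {willamette_dies} candidate dies per wafer")
print(f"Northwood:  roughly {northwood_dies} candidate dies per wafer")
```

The exact counts depend on the wafer size and on yield, but the direction is the point: a smaller die means noticeably more chips from every wafer.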
The die shrink also made room for Intel to increase the size of the Pentium 4’s on-chip level 2 cache from 256K to 512K. This extra cache takes the Pentium 4 from 42 million transistors to 55 million. The jumbo-sized L2 cache ought to help Northwood tackle the Pentium 4’s big bugaboo: low clock-for-clock performance. A larger cache should help keep the P4’s deep instruction pipeline fed, increasing the number of instructions per clock (IPC) the chip can execute.
Intel is introducing Northwood at two initial clock speeds: 2.0GHz and 2.2GHz. In order to differentiate the Northwood 2GHz from the older Pentium 4 “Willamette” 2GHz, Intel is calling the Northwood 2GHz the “Pentium 4 processor at 2.0 ‘A’ GHz.” The “A” designation will conjure up warmly remembered visions of the Celeron 300A for old-timers like me, while the rest of you will probably be wondering why Intel couldn’t come up with a better name than “2.0 ‘A’ GHz.”
The Athlon XP 2000+
The Athlon XP 2000+ is simply AMD’s latest speed ramp of the Athlon XP. Like all Athlon XPs, this new one gets a model number that’s independent of its clock speed. The previous top speed for the Athlon XP was the 1900+ model, which runs at 1.6GHz. (We reviewed the 1900+ in an earlier article.) The Athlon XP 2000+ runs at 1.67GHz.
The Athlon XP hasn’t yet undergone the die shrink to 0.13 microns. Like Northwood and unlike Willamette, however, the Athlon XP is made with copper interconnects, which AMD has been using on Athlon chips for quite some time now. AMD has plans to take the Athlon line to 0.13 microns this quarter; that chip is code-named Thoroughbred. However, even without the die shrink, the Athlon XP is only 128 square millimeters. Because Athlon XPs are made up of only 37.5 million transistors, they’re much smaller than the Pentium 4, and even smaller than the die-shrunk Northwood. All other things being equal, Athlon XPs ought to be cheaper to make, as well.
Don’t be fooled by the Athlon XP’s relatively pokey 1.67GHz clock speed. There’s a reason AMD puts that model number label on its CPUs; they perform quite a bit better, clock for clock, than the Pentium 4.
What to watch for in the test results
I’ll tell you now that this is going to be a very tight contest, so it’s important to keep the relative benchmark scores in perspective. All of the processors we’re testing today are exceptionally fast, so they’re often held back by other components in our test system, like memory, video cards, or hard drives. Not only that, but the processors themselves perform quite similarly, which should be no great shock given the healthy competition right now between Intel and AMD. All told, you’ll see a lot of tests where the results are within a few percentage points of one another, or less.
Although we run our tests multiple times and average the results in order to limit variability, many of these results are close enough that the differences may not matter. Either the variance between the results is within the margin of error, or, more commonly, the real-world difference between one score and another is negligible. Keep that in mind.
That’s not to say that none of the differences matter. They often do. Some of the performance differences are rather pronounced. And every bit of performance counts, especially in a grudge match like this one.
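Here is a minimal sketch of that way of reading close results: average the runs for each system, then ask whether the gap is big enough to care about. The scores below are made up for illustration, and the 3% cutoff is my own assumption, not a threshold we apply formally.

```python
from statistics import mean

def percent_difference(score_a: float, score_b: float) -> float:
    """How much higher score_a is than score_b, in percent of score_b."""
    return (score_a - score_b) / score_b * 100

def is_meaningful(runs_a, runs_b, threshold_pct: float = 3.0) -> bool:
    """Average each system's runs, then compare the gap to a cutoff.

    threshold_pct is an assumed cutoff, not a figure from the article.
    """
    gap = percent_difference(mean(runs_a), mean(runs_b))
    return abs(gap) >= threshold_pct

# Hypothetical scores from two runs on each of two systems.
system_a_runs = [50.1, 50.5]
system_b_runs = [49.8, 50.2]

gap = percent_difference(mean(system_a_runs), mean(system_b_runs))
print(f"Gap: {gap:.1f}%  meaningful: {is_meaningful(system_a_runs, system_b_runs)}")
```

With only two or three runs per test there isn’t enough data for real statistics, so a simple percentage cutoff like this is about as far as it makes sense to go.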
That said, there are a few interesting matchups here. For starters, you’ll want to keep an eye on how the Pentium 4 Willamette 2GHz stacks up against the 2GHz Northwood. The Northwood ought to be faster in many tests thanks to its larger L2 cache, but in other places, that extra cache may not help much. Some software routines won’t fit into a 256K L2 cache, but they’ll fit fine into Northwood’s 512K L2 cache. Those routines should run faster on Northwood.
Next, we’ve tested the Pentium 4 chips with both DDR SDRAM and RDRAM. These two types of RAM are vying for supremacy on the Pentium 4 platform, and the odds are very good that DDR SDRAM will win that battle in terms of sales. You may want to keep an eye on how those two types of memory perform.
Finally, there’s the main event: the Athlon XP 2000+ versus the 2.2GHz Northwood. Can the Northwood’s extra speed and cache help the Pentium 4 finally overcome the Athlon? We’ll see.
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.
Our test systems were configured like so:
| ||Athlon XP||Pentium 4 DDR||Pentium 4 RDRAM|
|Processor||AMD Athlon XP 1800+, AMD Athlon XP 2000+||Intel Pentium 4 2.0GHz, Intel Pentium 4 2.0 “A” GHz, Intel Pentium 4 2.2GHz||Intel Pentium 4 2.0GHz, Intel Pentium 4 2.0 “A” GHz, Intel Pentium 4 2.2GHz|
|Front-side bus||266MHz (133MHz double-pumped)||400MHz (100MHz quad-pumped)||400MHz (100MHz quad-pumped)|
|Motherboard||Epox EP-8KHA+||Abit BD7-RAID||Intel D850MD|
|Chipset||VIA KT266A||Intel 845||Intel 850|
|North bridge||VT8366A||82845 MCH||82850 MCH|
|South bridge||VT8233||82801BA ICH2||82801BA ICH2|
|Memory size||256MB (1 DIMM)||256MB (1 DIMM)||256MB (2 RIMMs)|
|Memory type||Micron PC2100 DDR SDRAM||Micron PC2100 DDR SDRAM||Samsung PC800 Rambus DRAM|
|Graphics||NVIDIA GeForce3 Ti 500 64MB (Detonator XP 21.83 video drivers)|
|Sound||Creative SoundBlaster Live!|
|Storage||IBM 75GXP 30.5GB 7200RPM ATA/100 hard drive|
|OS||Microsoft Windows XP Professional|
The test systems’ Windows desktops were set at 1024×768 in 32-bit color at a 75Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.
We used the following versions of our test applications:
- SiSoft Sandra Standard 2001te
- Compiled binary of C Linpack port from Ace’s Hardware
- ZD Media Business Winstone 2001 1.0.2
- ZD Media Content Creation Winstone 2001 1.0.2
- POV-Ray for Windows version 3.1g (multiple compiles)
- Sphinx 3.3
- ScienceMark 1.0
- LAME 3.90
- SPECviewperf 6.1.2
- MadOnion 3DMark 2001 Build 200
- Quake III Arena 1.30
- Serious Sam v1.05
All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
We’re going to start off with memory tests because, well, that’s how we generally start things off. Also, these tests are a little more theoretical than the rest, so we’ll get them out of the way before we move on to the real contest.
First up is the modified version of the Stream memory benchmark that’s included in SiSoft’s Sandra. This test measures memory bandwidth, which is one component of memory performance.
The results break down nicely into three separate groups. As expected, the RDRAM systems are fastest here. The Pentium 4 DDR systems are next, and you can see there’s no difference between Willamette and Northwood here; they’re all talking to the same memory over the same bus, so the memory bandwidth is nearly identical. Finally, the Athlon XP systems can’t transfer quite as much data to memory as the Pentium 4 systems. On some memory-intensive tasks, the Pentium 4 will have the advantage.
However, that’s only half the story. As our recent chipset review showed, RDRAM’s extra bandwidth comes at the price of higher memory latencies. DDR-based systems are much quicker at accessing memory in smaller chunks, which helps them compare well against RDRAM-based systems despite the bandwidth disparity.
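For reference, the theoretical peak bandwidth of each platform can be worked out from transfer rate times bus width. The sketch below uses the bus and memory speeds from our configuration table; the bus widths (64-bit front-side buses and DDR DIMMs, 16-bit RDRAM channels, two RDRAM channels on the 850) come from the public specs as I understand them, not from anything measured here.

```python
def peak_mb_per_s(transfers_mhz: float, bus_bytes: int, channels: int = 1) -> float:
    """Theoretical peak in MB/s: million transfers/s * bytes per transfer * channels."""
    return transfers_mhz * bus_bytes * channels

# Effective transfer rates from the configuration table; widths are assumptions
# based on public specs (64-bit = 8 bytes, 16-bit RDRAM channel = 2 bytes).
platforms = {
    "Athlon XP + PC2100 DDR": {
        "bus": peak_mb_per_s(266, 8),
        "memory": peak_mb_per_s(266, 8),
    },
    "Pentium 4 + PC2100 DDR": {
        "bus": peak_mb_per_s(400, 8),
        "memory": peak_mb_per_s(266, 8),
    },
    "Pentium 4 + PC800 RDRAM": {
        "bus": peak_mb_per_s(400, 8),
        "memory": peak_mb_per_s(800, 2, channels=2),
    },
}

for name, peaks in platforms.items():
    # The slower of the two links caps what the CPU can ever see.
    ceiling = min(peaks["bus"], peaks["memory"])
    print(f"{name}: bus {peaks['bus']:.0f} MB/s, "
          f"memory {peaks['memory']:.0f} MB/s, ceiling {ceiling:.0f} MB/s")
```

These ceilings account for RDRAM’s lead, but not for the gap between the two DDR platforms, which share the same theoretical memory peak. That difference shows up only in measured bandwidth.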
The more interesting test here is Linpack, which can give us a nice visual look at Northwood’s L2 cache in action. Here’s how the results look:
If you’re not familiar with a Linpack graph, watch closely. The X axis is the size of the data matrix Linpack is processing, and the Y axis is the calculation speed measured in megaflops. If data fits into a processor’s cache, the CPU can process that data much faster. As the size of the data matrix grows, the calculations will get progressively slower.
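If the two axes seem abstract, this sketch shows roughly where the numbers come from. It assumes the Linpack port stores the matrix as 8-byte double-precision values and uses the standard Linpack operation count of 2n³/3 + 2n² for solving an n-by-n system. The timing in the example is hypothetical.

```python
def matrix_kib(n: int, bytes_per_element: int = 8) -> float:
    """Memory footprint of an n x n matrix in KiB (assumes 8-byte doubles)."""
    return n * n * bytes_per_element / 1024

def linpack_mflops(n: int, seconds: float) -> float:
    """Megaflops using the standard Linpack operation count for an n x n solve."""
    operations = (2 * n ** 3) / 3 + 2 * n ** 2
    return operations / seconds / 1e6

# X axis: how big is the working set for a given matrix order?
for order in (128, 181, 256):
    print(f"n = {order}: {matrix_kib(order):.0f}K")

# Y axis: a hypothetical run that solves a 100 x 100 system in one millisecond.
print(f"{linpack_mflops(100, 0.001):.0f} megaflops")
```

Once the footprint on the X axis outgrows a cache level, the time per solve goes up faster than the operation count does, and the megaflops figure on the Y axis falls.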
This graph shows us several things. First, you can see that Northwood’s L2 cache is quite a bit larger than Willamette’s. Willamette’s performance begins to drop off once we get into matrices of about 192K in size, while Northwood peaks at about 384K. Not only that, but the extra cache helps Northwood’s peak performance climb much higher than Willamette’s can.
Next, notice that the Athlon XP’s effective cache size is greater than 256K. Although the Athlon XP has a 256K L2 cache, its L2 cache doesn’t replicate the contents of the L1 data cache like the Pentium 4’s does. You can even see that the Athlon XP’s 64K L1 data cache is much faster than its L2 cache. The Athlon XP’s exclusive L2 cache gives it an effective cache size of 320K. However, the Athlon XP’s L2 cache is measurably slower than Northwood’s.
Now the intriguing bit: the Athlon XP shows us all of its 256K L2 and 64K L1 data cache in Linpack. Performance doesn’t drop off sharply until the matrix size hits 320K. The Northwood, however, peaks at about 384K, well below its 512K L2 cache size. I expect the difference here has something to do with the way these two chips manage their respective caches.
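The cache arithmetic is simple enough to sketch. An exclusive L2 adds its capacity to the L1 data cache, while an L2 that replicates L1 contents does not. The drop-off points below are the approximate values read off the Linpack results above; the Pentium 4’s 8K L1 data cache size is from the public specs and isn’t something we measured.

```python
def effective_cache_kib(l1_data_kib: int, l2_kib: int, exclusive: bool) -> int:
    """Unique data the cache hierarchy can hold, in KiB.

    An exclusive L2 never duplicates L1 contents, so the two sizes add up.
    An L2 that replicates L1 holds nothing beyond its own capacity.
    """
    return l1_data_kib + l2_kib if exclusive else l2_kib

chips = {
    # name: (L1 data KiB, L2 KiB, exclusive L2?, observed Linpack drop-off in KiB)
    "Athlon XP": (64, 256, True, 320),
    "Willamette": (8, 256, False, 192),  # 8K L1 data: assumption from public specs
    "Northwood": (8, 512, False, 384),
}

for name, (l1, l2, exclusive, observed) in chips.items():
    effective = effective_cache_kib(l1, l2, exclusive)
    print(f"{name}: effective {effective}K, drop-off near {observed}K "
          f"({observed / effective:.0%} of effective size)")
```

Run the numbers and both Pentium 4s start to fall off at about three quarters of their L2 size, while the Athlon XP uses everything it has. That fits the hunch that cache management, not raw capacity, explains the difference.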
Business Winstone 2001
Business Winstone has been around forever, and this latest version still does a decent job showing us how a system performs in common office applications.
Did I mention that we had some very close results? The Northwood at 2.2GHz is fastest all around, but only when paired with RDRAM. Of the DDR-equipped systems, the Athlon XP 2000+ is fastest, but just barely.
Content Creation Winstone 2001
Business Winstone’s companion test is a little more intensive. It measures performance in applications like audio editing, page design, and image processing.
Here the Athlon XP takes a decisive lead. Northwood is just a tick faster than Willamette, though.
POV-Ray 3D rendering
POV-Ray is a freeware software ray-tracing program that creates high-quality 3D scenes. It’s also a very useful measure of a processor’s performance, particularly on floating-point math. Our POV-Ray tests use the original release of POV-Ray 3.1, plus Steve Schmitt’s recompiled versions, just to see what difference the various compilers and compiler settings can make.
This time out, we’re using an updated version of Steve Schmitt’s recompiled POV-Ray. Although there are two flavors of recompiled POV-Ray, including one specifically optimized for the Pentium 4, we’re only using the generic “PIII” version, which runs fine on both the Athlon and the Pentium 4. Unfortunately, some folks have reported getting buggy output from the P4-specific binary, so we’ll have to skip it.
The Athlon XP dominates in POV-Ray, finishing the render a full 80 (and a half) seconds before the 2.2GHz Northwood, and the gap’s over two minutes with the unoptimized binary. Athlons have always excelled in floating-point math, so this result is not a big surprise.
LAME MP3 encoding
LAME is the encoder of choice around Damage Labs for high-quality output, so this test holds some interest for me. More speed for MP3 encoding is always good.
It’s mighty close yet again, but the Athlon XP 2000+ comes out on top.
Quake III Arena
The crown for Quake III performance changed hands when we tested the Athlon XP 1900+ a while back. Before then, Quake III was definitely Pentium 4 territory. Can Northwood recapture the crown?
Most definitely. Three different Northwood configurations outpace the Athlon XP 2000+. However, the older Willamette 2GHz can’t beat the Athlon XP 1800+.
If Quake III seems old and musty to you, Serious Sam ought to be more up your alley.
The Athlon XP takes this one in a walk. Serious Sam has always run especially well on an Athlon, and the newest, fastest Athlon XP is no exception.
The top spot in 3DMark 2001 has changed hands more often than most Euro notes. It seems like every time a new NVIDIA driver, chipset, or processor hits the streets, we’ve got a new 3DMark leader. The Pentium 4 2GHz held the lead last time out. Can the Athlon XP pull into the lead?
Not exactly. Northwood at 2.2GHz is faster, but it’s extremely close overall. The big story is how much faster Northwood is than Willamette: about 500 points at the same clock speed.
SPECviewperf workstation graphics
Viewperf measures performance in workstation-class 3D applications like CAD/CAM and 3D modeling tools. Some of these tests are limited almost entirely by our GeForce3 graphics card, but a few of them are still interesting.
This should come as no surprise: in those tests where the graphics card isn’t the performance bottleneck, it’s a toss up. The Athlon XP is faster in some, and the Pentium 4 systems are faster in others.
The Sphinx speech recognition tests came to us via Ricky Houghton, who works in a speech recognition effort at Carnegie Mellon University. They’re based on Sphinx 3.3, which is an advanced system that promises greater accuracy in speech recognition. However, our past tests have shown that Sphinx 3.3 still can’t quite run fast enough on a standard PC to handle tasks in real time; it seems to be limited primarily by memory bandwidth, but faster CPUs do help performance, as well.
What we’re after here is for our speech recognition test to execute faster than real time, which would help make Sphinx 3.3 workable in real-world applications. For a while now, I’ve hoped that Northwood might take us past that threshold.
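To make “faster than real time” concrete, here is a small sketch of the kind of score involved. I’m assuming the score is the length of the audio divided by the time taken to process it, so that 1.0 is the break-even point and higher is better. The durations are hypothetical.

```python
def speed_vs_real_time(audio_seconds: float, processing_seconds: float) -> float:
    """Audio length divided by processing time; 1.0 means exactly real time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds

def keeps_up(audio_seconds: float, processing_seconds: float) -> bool:
    """True when recognition finishes no later than the audio itself does."""
    return speed_vs_real_time(audio_seconds, processing_seconds) >= 1.0

# Hypothetical: one minute of speech, recognized in 65 and then 50 seconds.
for processing in (65.0, 50.0):
    score = speed_vs_real_time(60.0, processing)
    print(f"{processing:.0f}s to process 60s of audio: "
          f"score {score:.2f}, real time: {keeps_up(60.0, processing)}")
```

Note that speech researchers often quote the inverse ratio, where lower is better, so check which way a given score runs before comparing numbers.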
Unfortunately, not even Northwood at 2.2GHz can take us into the Promised Land. Regardless, all of these processors are very close. Maybe we can make it happen yet with a little tweaking, eh?
On to Tim Wilkens’ computational benchmark, ScienceMark. This suite of tests measures number-crunching ability by running some computationally intensive scientific equations. Like 3DMark, ScienceMark then spits out a composite number denoting a system’s overall score in the suite.
The Athlon XP 2000+ is an absolute monster in scientific computing: Godzilla in a lab coat. However, the individual tests show the Pentium 4’s strength, as well.
The Athlon XP is fastest in the QMC and Liquid Argon tests, but Primordia is dominated by the Pentium 4.
As you’ve already seen, the Pentium 4 2.2GHz and the Athlon XP 2000+ are the fastest x86 processors on the planet. They have achieved benchmark scores that have never been seen before. They have forever upped the ante in the performance market. Virtually any power user would be completely satisfied with the power of these just-announced chips.
Naturally, we had to try to overclock the rot out of them.
Of course, overclocking isn’t as easy as it used to be. Granted, Intel has had their multiplier lock in place for ages, but AMD always gave you a pretty easy out. With the Slot A Athlons it was the Golden Fingers cards (R.I.P.) and with the Socket A Thunderbird Athlons, a mechanical pencil was all you needed to achieve overclocking bliss.
The Athlon XP, however, changed all that. While it’s still technically possible to unlock the Athlon XP, it’s much more difficult, and as a result it’s likely that even many enthusiasts will now give up on multiplier control and pursue an easier overclocking method.
That method, of course, is bus overclocking. Many enthusiast boards allow for bus speed control via the BIOS. Although it is less versatile without multiplier control, bus overclocking can still reap substantial rewards. This is because when you raise the bus speed, you’re overclocking not only the processor, but also the RAM, PCI bus, AGP bus . . . you get the idea. The downside is that when you overclock a large number of components, there’s a higher chance that one of them will hit a wall and stop your fun.
We overclocked both the 2.2GHz Pentium 4 and the Athlon XP 2000+ to the highest stable speed we could find. The Pentium 4 tests were conducted with the Abit BD7-RAID, while the Athlon XP tests used the Epox EP-8KHA+. First we’ll go over the speed increases we realized, then we’ll take a look at a subset of our earlier benchmarks, comparing “stock” speed to top stable overclocked speed.
The Pentium 4, unsurprisingly, was an overclocking beast, topping out at a stable bus speed of 118MHz (stock is 100MHz). That gives us a processor speed of 2596MHz, nearly a 400MHz gain. The Athlon XP didn’t get quite that large a jump, but it still nailed down a pretty impressive increase of its own, going from a 133MHz bus to a 142MHz bus. Processor speed went from 1667MHz to 1775MHz.
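The math behind those numbers is just the locked multiplier times the new bus speed. This sketch recovers each chip’s multiplier from its stock speeds and then applies it to the overclocked bus. Rounding to the nearest half step is my assumption about how the multipliers are spaced.

```python
def locked_multiplier(stock_cpu_mhz: float, stock_bus_mhz: float) -> float:
    """Recover the multiplier from stock speeds (assumes 0.5x steps)."""
    return round(stock_cpu_mhz / stock_bus_mhz * 2) / 2

def overclock(stock_cpu_mhz: float, stock_bus_mhz: float, new_bus_mhz: float):
    """Return (multiplier, new CPU MHz, CPU gain in percent) for a bus overclock."""
    multiplier = locked_multiplier(stock_cpu_mhz, stock_bus_mhz)
    new_cpu_mhz = multiplier * new_bus_mhz
    gain_pct = (new_cpu_mhz / stock_cpu_mhz - 1) * 100
    return multiplier, new_cpu_mhz, gain_pct

results = {
    "Pentium 4 2.2GHz": overclock(2200, 100, 118),
    "Athlon XP 2000+": overclock(1667, 133, 142),
}

for chip, (multiplier, new_cpu_mhz, gain_pct) in results.items():
    print(f"{chip}: {multiplier}x multiplier, {new_cpu_mhz:.0f}MHz, +{gain_pct:.1f}%")
```

The percentages show why the two chips shouldn’t be expected to gain equally in the benchmarks that follow: the Pentium 4’s overclock is roughly three times as large in relative terms.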
Pushing the 2.2GHz Pentium 4 to the limit finally breaks the real-time barrier for the Sphinx speech recognition test. And not by a little bit, either; we can see that the increase in bus speed really allows the Northwood to stretch its legs and blast through the magic 1.0 mark. The Athlon XP, meanwhile, posts minuscule gains here. It’s likely that some other component is creating a bottleneck here, as the extra hundred or so megahertz provide hardly any benefit.
Once again, we see the Pentium 4 utilizing its increased bus speed to great effect, gaining nearly 35 frames per second. The Athlon XP does better with its increased bus, but not to the same extent. Of course, it’s important to remember that the Pentium 4 system gained relatively more than the Athlon XP in both bus speed and processor speed.
The trend continues in the final test, as the Pentium 4 gains a lot of performance from its overclocked bus. The Athlon XP scores are basically a draw; though the overclocked system technically scored lower, the differences are statistically insignificant.
Performance-wise, it’s a toss-up. I would like to declare one or the other of these processors the clear winner, but that’s just not possible. The Athlon XP 2000+ and Pentium 4 2.2GHz are locked in a dead heat for the title of “fastest x86 processor.”
That’s significant progress for Intel, because AMD has held an almost-constant performance lead for well over a year now. With the introduction of the 845 chipset with support for DDR memory and now Northwood, the Pentium 4 platform has finally come into its own. The die-shrunk Pentium 4 is primed for Intel to crank up the clock, and our overclocking exploits show Intel has the headroom to do so at will. The P4 platform’s high-speed bus and ample memory bandwidth will allow significant performance gains as clock speeds ramp, too.
As for AMD, they have managed to hang on to a share of the performance title even as Intel has introduced a much-improved Pentium 4. AMD’s ability to compete with Intel over the past few years has been unprecedented and impressive. However, AMD is facing significant challenges ahead. The Athlon XP will have to transition to 0.13-micron production before too long, and more importantly, the Athlon XP needs a faster system bus in order to take advantage of faster forms of memory, like DDR333. As consistently as AMD has executed on its plans, however, I find it hard to doubt they will meet these challenges. Heck, there’s probably enough headroom in the current, 0.18-micron Athlon XP for clock speeds as high as 1.8GHz.
Finally, there is the little matter of price. Intel’s pricing for the new P4 chips is like so:
Pentium 4 2.2GHz – $562
Pentium 4 2.0 “A” GHz – $364
AMD’s prices, meanwhile, are a little more modest:
Athlon XP 2000+ (1.67GHz) – $339
Athlon XP 1900+ (1.60GHz) – $269
Athlon XP 1800+ (1.53GHz) – $223
Athlon XP 1700+ (1.47GHz) – $190
Athlon XP 1600+ (1.40GHz) – $160
Obviously, the Athlon XP offers the better price-performance ratio. For enthusiasts looking to build their own PCs, the Athlon XP is probably still the way to go. For those of you looking to buy a PC from a large OEM like Gateway or Dell, it’s hard to say. AMD’s lower prices might let you get more PC for the money. However, Intel traditionally offers steep discounts to OEMs, so shop carefully.
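Since the two top chips finished in a dead heat, the value comparison mostly comes down to price. This quick sketch works out the premium each new Pentium 4 carries over the Athlon XP 2000+, using the list prices above. Street and OEM prices will differ.

```python
# List prices in US dollars, as quoted above.
prices = {
    "Pentium 4 2.2GHz": 562,
    "Pentium 4 2.0A GHz": 364,
    "Athlon XP 2000+": 339,
    "Athlon XP 1900+": 269,
    "Athlon XP 1800+": 223,
}

def premium_pct(price: float, baseline: float) -> float:
    """How much more `price` costs than `baseline`, in percent."""
    return (price - baseline) / baseline * 100

baseline_chip = "Athlon XP 2000+"
for chip in ("Pentium 4 2.2GHz", "Pentium 4 2.0A GHz"):
    extra = premium_pct(prices[chip], prices[baseline_chip])
    print(f"{chip} costs {extra:.1f}% more than the {baseline_chip}")
```

At these prices the 2.2GHz Pentium 4 costs about two-thirds more than the Athlon XP 2000+ for comparable performance.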