That’s in addition to the L1 and L2 caches you’ll find on normal Pentium 4s. Not only that, but the Extreme Edition is based on older technology; the Gallatin core is derived from the Pentium 4 “Northwood” design that powered the P4’s successful run between 2GHz and 3GHz over the past couple of years. Newer P4s are based on the 90nm “Prescott” core, a substantially redesigned processor that offers higher performance in some cases and lower in others.
Intel today is launching a new version of the Pentium 4 Extreme Edition, and because it’s not based on the Prescott core, it lacks certain new features, such as support for SSE3 instructions. What it does have, however, is one very important attribute: the ability to run on a 1066MHz bus. The new P4 Extreme Edition’s partner in crime is the Intel 925XE Express chipset, a slightly tweaked version of the 925X Express chipset that also allows for a 1066MHz front-side bus, up from 800MHz. This faster bus should remove a key bottleneck, allowing the CPU to talk to dual channels of 533MHz DDR2 memory with the benefit of symmetry. (533MHz times two, you see, is 1066MHz. Clever, no?)
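The "symmetry" argument is easy to check with back-of-the-envelope math. Here is a minimal sketch, assuming 64-bit (8-byte) data paths for both the front-side bus and each DDR2 channel, and using the nominal transfer rates quoted above. The function name is just for illustration.

```python
# Peak theoretical bandwidth sketch.
# Assumption: the FSB and each DDR2 channel are 64 bits (8 bytes) wide.
BYTES_PER_TRANSFER = 8

def peak_bandwidth_mb_s(mega_transfers_per_s, channels=1):
    """Peak MB/s for a bus running at the given effective rate (MT/s)."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER * channels

fsb_800 = peak_bandwidth_mb_s(800)                    # old front-side bus
fsb_1066 = peak_bandwidth_mb_s(1066)                  # new front-side bus
ddr2_533_dual = peak_bandwidth_mb_s(533, channels=2)  # two DDR2-533 channels

print("800MHz FSB:        ", fsb_800, "MB/s")
print("1066MHz FSB:       ", fsb_1066, "MB/s")
print("Dual DDR2-533:     ", ddr2_533_dual, "MB/s")
```

On those assumptions, the 800MHz bus tops out below what the two memory channels can supply, while the 1066MHz bus matches them.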
The big question about this new Extreme Edition processor is whether a faster bus alone can bring its performance into truly extreme territory. The new CPU runs at 3.46GHz, only up 66MHz from the 3.4GHz version that preceded it. Well, OK, 66.6666666666667MHz, to be exact. In our last CPU review, the Extreme Edition 3.4GHz was exposed like Ashlee Simpson on SNL. The regular ol’ Pentium 4 560 at 3.6GHz, a product that costs less than half the price, often outperformed the Extreme Edition 3.4GHz, and AMD’s range of Athlon 64 processors absolutely embarrassed the Pentium 4s in our gaming tests. Doom 3 was playing, but the P4’s lips weren’t moving. Can the Extreme Edition’s massive cache, fast DDR2 memory, and a faster system bus restore some of the luster to Intel’s flagship product? We’re about to find out.
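For the curious, that oddly precise 66.67MHz figure falls out of simple clock math. This is a rough sketch, assuming the core clock is the quad-pumped bus's base clock times a fixed multiplier. The multipliers of 17 and 13 are my assumption, chosen because they reproduce the rated speeds.

```python
# Core clock = base bus clock x multiplier.
# Assumption: multipliers of 17 (old XE) and 13 (new XE), inferred from
# the rated clock speeds rather than taken from any spec sheet.
QUAD_PUMP = 4

def core_clock_mhz(effective_bus_mhz, multiplier):
    base_clock = effective_bus_mhz / QUAD_PUMP
    return base_clock * multiplier

old_xe = core_clock_mhz(800.0, 17)         # 200MHz base clock
new_xe = core_clock_mhz(3200.0 / 3, 13)    # 266.67MHz base clock
delta = new_xe - old_xe

print(f"Old XE: {old_xe:.2f}MHz")
print(f"New XE: {new_xe:.2f}MHz")
print(f"Difference: {delta:.2f}MHz")
```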
The new chip and set
The new P4 Extreme Edition comes in a pin-free package, ready to slide into a newfangled LGA775 CPU socket where the pins are on the motherboard. That makes the P4 Extreme Edition a fairly sturdy bit of hardware compared to most CPUs. Let’s give it the extreme close-up treatment.
If you look closely at the picture, you can see the tiny indentations in the gold contact pads on the underside of the CPU where the pins of the LGA775 socket have made contact. Funky, huh? Having used LGA775 for a while now, I’m still not 100% sure about its sturdiness over the long haul, but my initial positive impressions remain largely confirmed. Handle it carefully, and the thing just works. Bust something, and in this case, you’ve gotta replace a $200 motherboard instead of a $1000 processor. Not a bad trade-off, all things considered.
The 925XE Express core logic chipset mated to the new Extreme Edition is just a tweaked version of the 925X Express, but that’s no bad thing. Intel’s 900-series chipsets are easily the most advanced core logic silicon available today, with support for a range of new technologies and standards, including PCI Express, DDR2 memory, Intel High Definition Audio, and advanced Serial ATA storage. If you’re not already familiar with the 900-series Express chipsets, you owe it to yourself to go read our review of them. Intel has revamped large swaths of the PC platform with these chipsets. A number of companies are preparing core logic chipsets for the Athlon 64 that include support for PCI Express and some of the other new standards in the Intel 900 series, but it looks like none of them will support the full spectrum of new goodies like Intel. Also, those chipsets haven’t quite arrived yet, so the Pentium 4 is currently your only choice for PCI Express.
A bus with only one seat?
One of the most surprising things about the new Extreme Edition processor is that it stands alone as the only Pentium 4 capable of running on a 1066MHz bus. This is a unique situation for Intel; in the past, the company has traditionally rolled out a new bus speed across most of its desktop processors in one stroke. Not so here.
I was puzzled by this decision, so I asked an Intel PR rep about the rationale behind it. He said the 1066MHz bus would not make its way to regular Prescott-based Pentium 4s, at least through the end of 2004, for three reasons. First, Intel wants the faster bus speed to be a product differentiator for the Extreme Edition CPU. Second, it takes time to ramp volume on new products with faster bus speeds. Third, Intel “can’t just magically” make all CPUs work properly at a 1066MHz front-side bus.
I believe this is an example of the time-honored PR tradition of listing one’s reasons in ascending order of importance. More likely than not, the 1066MHz bus will be confined to Extreme Edition processors because not all Prescott-based P4s are happy on a higher frequency bus. The fact that the Extreme Edition is still based on an older CPU core is telling. If Intel could roll out a 1066MHz bus across its entire Pentium 4 line without too much pain, I believe the company would do so sooner rather than later.
Whatever the case, the new Extreme Edition is the only P4 with official support for a 1066MHz bus, and the very nice Intel D925XECV2 mobo wouldn’t oblige our attempts to run a Prescott CPU on a 1066MHz FSB. I expect the first wave of 925XE motherboards from Taiwan will be more obliging, opening up some interesting overclocking possibilities for lower-end P4 chips.
Anyhow, that’s for another day. For now, let’s see how the new Extreme Edition performs.
Please note that several of our test CPUs are actually underclocked versions of other products. Specifically, the Pentium 4 model 540 and 550 entries are actually our Pentium 4 560 3.6GHz engineering sample, which came with an unlocked multiplier for testing at different speeds, running at 3.2 and 3.4GHz, respectively. Similarly, the 130nm version of the Athlon 64 3500+ is a down-clocked Athlon 64 3800+, and our Athlon 64 3200+ results were achieved by testing the 90nm Athlon 64 3500+ at 2.0GHz. For most intents and purposes, save perhaps for our power consumption tests, these underclocked processors should perform just like the real McCoys.
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.
Our test systems were configured like so:
|Processor||Athlon 64 3200+ 2.0GHz (S939); Athlon 64 3500+ 2.2GHz (90nm); Athlon 64 3500+ 2.2GHz (130nm); Athlon 64 3800+ 2.4GHz; Athlon 64 4000+ 2.4GHz; Athlon 64 FX-55 2.6GHz||Pentium 4 540 3.2GHz; Pentium 4 550 3.4GHz; Pentium 4 560 3.6GHz; Pentium 4 Extreme Edition 3.4GHz||Pentium 4 Extreme Edition 3.46GHz|
|System bus||1GHz HyperTransport||800MHz (200MHz quad-pumped)||1066MHz (266MHz quad-pumped)|
|Motherboard||Asus A8V Deluxe||Abit AA8 DuraMax||Intel D925XECV2|
|BIOS revision||1008 beta 1||1.4||CV92510A.86A.0338|
|North bridge||K8T800 Pro||925X MCH||925XE MCH|
|Chipset drivers||4-in-1 v.1.11 beta (9/7/04)||INF Update 220.127.116.112; IAA for RAID 18.104.22.16815||INF Update 22.214.171.1242; IAA for RAID 126.96.36.19915|
|Memory size||1GB (2 DIMMs)||1GB (2 DIMMs)||1GB (2 DIMMs)|
|Memory type||OCZ PC3200 EL DDR SDRAM at 400MHz||OCZ PC2 5300 DDR2 SDRAM at 533MHz||OCZ PC2 5300 DDR2 SDRAM at 533MHz|
|RAS to CAS delay||2||3||3|
|Hard drive||Maxtor MaXLine III 250GB SATA 150|
|Audio||Integrated VT8237/ALC850 with 3.64 drivers||Integrated ICH6R/ALC880 with 188.8.131.5222 drivers||Integrated ICH6R/ALC880 with 184.108.40.20632 drivers|
|Graphics||GeForce 6800 GT 256MB AGP with ForceWare 66.81 drivers||GeForce 6800 GT 256MB PCI-E with ForceWare 66.81 drivers||GeForce 6800 GT 256MB PCI-E with ForceWare 66.81 drivers|
|OS||Microsoft Windows XP Professional|
|OS updates||Service Pack 2, DirectX 9.0c|
All tests on the Intel systems were run with Hyper-Threading enabled.
Thanks to OCZ for providing us with memory for our testing. If you’re looking to tweak out your system to the max and maybe overclock it a little, OCZ’s RAM is definitely worth considering.
The test systems’ Windows desktops were set at 1152×864 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.
We used the following versions of our test applications:
- Cachemem 2.65MMX
- SiSoft Sandra 2004 SP2.b
- Compiled binary of C Linpack port from Ace’s Hardware
- DOOM 3 with trdelta1 demo
- Far Cry 1.2 with tr3-pier demo
- Counter-Strike Source with trdemo1
- Unreal Tournament 2004 v3323 with trdemo1
- 3DMark05 v110
- Sphinx 3.3
- LAME 3.96.1 (build from mitiok.cjb.net)
- Xmpeg 5.0.3 with DivX Video 5.21
- Cinebench 2003
- POV-Ray for Windows 3.6
- ScienceMark 2.0 beta (23SEP03 build)
- picCOLOR v4.0 build 479
- WorldBench 5.0
The tests and methods we employ are generally publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
The 1066MHz bus on the Extreme Edition ought to allow for more memory bandwidth. Does it deliver?
Yes, the P4 Extreme Edition 3.46GHz (dubbed the P4 XE 3.46GHz in our graphs) gets a nice bump in Sandra memory bandwidth. However, look at the Cachemem results, and you can see how the P4 Prescott’s more aggressive prefetching of data into the L2 cache produces better bandwidth scores, even on an 800MHz bus. This is one reason I’d like to see a Prescott with a 1066MHz bus; I think maybe the Prescott could take better advantage of it.
Linpack shows us, visually, how the P4 Extreme Edition’s copious on-chip cache dwarfs that of all competitors. With data matrix sizes up to 2MB, calculations are performed almost entirely in L2 cache, where they are very quick.
I should mention, though, that the P4 Extreme Edition’s total effective cache size is 2MB, even though the chip has an 8K L1 data cache, a 512K L2 cache, and a 2MB L3 cache. Intel’s cache hierarchy is not exclusive, so total effective cache size isn’t additive. The L2 cache mirrors the contents of the L1 data cache, and the L3 cache mirrors the contents of the L2. This arrangement is distinct from the exclusive caches on recent AMD processors, where 128K of L1 cache and a 1024K L2 cache would lead to a total effective cache size of 1152K.
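The difference between the two schemes boils down to "largest level" versus "sum of levels." Here is a minimal sketch of that arithmetic, using a deliberately simplified model in which an inclusive hierarchy's effective size is just its biggest cache. The function name is mine, and real-world behavior is messier than this.

```python
def effective_cache_kb(levels_kb, inclusive):
    """Simplified model: inclusive caches duplicate the smaller levels,
    so only the largest counts; exclusive caches hold distinct data,
    so the sizes add up."""
    return max(levels_kb) if inclusive else sum(levels_kb)

# P4 Extreme Edition: 8K, 512K, and 2MB (2048K) levels, inclusive
p4_xe = effective_cache_kb([8, 512, 2048], inclusive=True)

# Athlon 64: 128K of L1 plus 1024K of L2, exclusive
athlon_64 = effective_cache_kb([128, 1024], inclusive=False)

print("P4 XE effective cache:    ", p4_xe, "K")
print("Athlon 64 effective cache:", athlon_64, "K")
```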
The faster bus on the Extreme Edition shows a small but measurable improvement in our memory access latency benchmark. Let’s look at this effect in more detail with our fancy 3D graphs.
Here’s a slightly indulgent look at memory access latencies in more detail. If the following intimidates you, just skip to the next page with the gaming results. Remember, though, to flip back here if the boss is looking over your shoulder.
I’ve colored the data series below according to how they correspond to different parts of the memory subsystem. Yellow is L1 cache, light orange is L2 cache, and orange is main memory. The red series, if present, represents L3 cache. Of course, caches sometimes overlap, so the colors are just an interesting visual guide.
The new Extreme Edition takes less time getting to main memory than any of the other Pentium 4 chips, as one might expect. That allows it to close the gap slightly with the Athlon 64, but the AMD chips’ on-die memory controller is very tough to beat.
Let’s get right down to the gaming results now with Doom 3. We tested using a custom-recorded demo that should be fairly representative of most of the single-player gameplay in Doom 3.
The faster bus gives the P4 Extreme Edition a boost, but it’s only enough to bring it up to speed with the Athlon 64 3200+, not nearly enough to make it competitive at the high end.
Our Far Cry demo takes place on the Pier level, in one of those massive, open outdoor areas so common in this game. Vegetation is dense, and view distances can be very long.
Far Cry’s the same tune in a different key. The new Extreme Edition is a little bit faster, but nothing like it needs to be in order to catch the Athlon 64s.
This is a final, release version of Counter-Strike: Source that we’re using, available for purchase via Valve’s Steam distribution system. Our demo game takes place on the cs_italy map.
The new bus yields an extra 3 frames per second, on average, in CS: Source, but that little boost is enough to move the Extreme Edition ahead of the A64 3500+.
Unreal Tournament 2004
Our UT2004 demo shows yours truly putting the smack down on some bots in an Onslaught game.
The move to a 1066MHz bus doesn’t help much in a UT2004 timedemo benchmark.
However, could the picture change during actual gameplay? Some folks from Intel have suggested to us that we should consider testing gameplay performance with the FRAPS frame rate capture program instead of relying on an in-game benchmarking function. The suggestion makes some sense, because timedemo playback tools don’t always use every aspect of the game engine, such as physics, A.I., and user input routines.
I tried using FRAPS with a couple of games, including Doom 3 and Rome: Total War, but frame rate caps in those games prevented us from being able to show meaningful performance differences between different processors. UT2004, which is very much a CPU-bound game, was a different story. The results below are averaged from five different 150-second gaming sessions played on the same Onslaught map as in our timedemo above, ONS-Torlan. I was playing against computer-controlled bots, so UT2004’s A.I. was working overtime.
Somewhat surprisingly, the results track very, very closely with the frame rates in our timedemo above. The Extreme Edition 3.46GHz still produces a lower average frame rate than the A64 3500+, but the all-important frame rate minimum is higher on the Extreme Edition, so it probably produces a slightly smoother gaming experience overall.
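If you want to try this sort of testing yourself, the bookkeeping is simple. What follows is only a sketch of one way to boil several sessions of per-second frame rate samples down to an average and a minimum. Averaging the per-session minimums is my assumption about a reasonable approach, and the sample numbers are made up purely to show the math; they are not our results.

```python
from statistics import mean

def summarize_sessions(sessions):
    """sessions: a list of sessions, each a list of per-second FPS samples.
    Returns the mean of the per-session averages and the mean of the
    per-session minimums (an assumed aggregation, not a FRAPS feature)."""
    per_session_avg = [mean(s) for s in sessions]
    per_session_min = [min(s) for s in sessions]
    return {
        "avg_fps": mean(per_session_avg),
        "min_fps": mean(per_session_min),
    }

# Hypothetical placeholder samples from two very short sessions
example_sessions = [[60, 45, 72], [58, 50, 66]]
summary = summarize_sessions(example_sessions)
print(summary)
```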
On a wholly subjective front, I noticed a couple of differences between CPUs while playing through these Onslaught matches. One, playing with the Athlon 64 FX-55 was smooth as glass. I could feel the difference between it and the processors in the middle of the pack. Two, the lower end processors, especially the lower speed P4 Prescotts, felt a little sluggish at points; nothing too distracting, but the game was a little choppy sometimes.
Before we move on, we tried one more thing with UT2004. We tested CPU performance using its software renderer, just to see what would happen.
For software rendering, the P4 Extreme Edition is tops.
The 1066MHz bus allows the Extreme Edition 3.46GHz to perform quite a bit better in 3DMark05’s two CPU tests, bringing its overall CPU score up above the Athlon 64 3800+.
WorldBench overall performance
WorldBench uses scripting to step through a series of tasks in common Windows applications. It then produces an overall score for comparison. More impressively, WorldBench spits out individual results for its component application tests, allowing us to compare performance in each. We’ll look at the overall score, and then we’ll show individual application results alongside the results from some of our own application tests.
Overall, the P4s perform better, in relative terms, in WorldBench’s suite of applications than they did in our gaming tests. The new Extreme Edition just outdoes the Athlon 64 3800+.
Audio editing and encoding
LAME MP3 encoding
We used LAME to encode a 101MB 16-bit, 44KHz audio file into a very high-quality MP3. The exact command-line options we used were:
lame --alt-preset extreme file.wav file.mp3
In both apps, the Extreme Edition 3.46GHz is among the fastest processors for audio encoding.
Video encoding and editing
XMPEG DivX video encoding
We used the default settings for the DivX codec to encode a 3000-frame sequence from a DVD-formatted MPEG2 source file.
Windows Media Encoder
VideoWave Movie Creator
The P4 XE 3.46GHz is at or near the top in three of our four video tests, but it’s near the bottom in Adobe Premiere video editing, oddly enough. Still, the P4 XE’s overall video encoding and editing performance is excellent.
We thank Dr. Reinert Muller with the FIBUS Institute for pointing us toward his picCOLOR benchmark. This image analysis and processing tool is partially multithreaded, and it shows us the results of a number of simple image manipulation calculations. We’re using a new build of picCOLOR this time out; it removes the video tests, which are highly dependent on the chipset and video card, from the calculation of the overall score.
The P4 XE 3.46GHz is middle of the pack, all told, in our image processing tests. The P4 Prescott 3.6GHz processor outdoes the XE 3.46GHz in ACDSee and picCOLOR, and the Athlon 64 FX-55 takes the top spot in all three apps.
Multitasking and office applications
WorldBench’s MS Office test runs the various components of the Office suite simultaneously and switches between them, as most users would. The Pentium 4 CPUs surpass the Athlons here, probably in part thanks to the Hyper-Threading capabilities of the Pentium 4.
Mozilla and Windows Media Encoder
The Athlon 64s easily outrun the Pentium 4s in the Mozilla test, but when you add a Windows Media Encoder session to the mix for some intensive multitasking, the P4 XE comes roaring back to take the second spot, just a hair behind the FX-55.
Sphinx speech recognition
Ricky Houghton first brought us the Sphinx benchmark through his association with speech recognition efforts at Carnegie Mellon University. Sphinx is a high-quality speech recognition routine. We use two different versions, built with two different compilers, in an attempt to ensure we’re getting the best possible performance.
Sphinx is a case where the Prescott’s design tweaks overcome the Extreme Edition’s additional cache and faster bus speed. The P4 XE is fairly competitive here, but it’s slower than its cheaper cousins.
WinZip is like Sphinx in that the Prescott chips do relatively well here. The XE 3.46GHz is no slouch, but it takes third in a tough field.
The Nero test is all about the disk controller, and in this case, the Intel ICH6 chip handily beats the VIA disk controller on our Athlon 64 test rig. CPU speed, as you can see, isn’t really a factor here.
Cinebench is based on Maxon’s Cinema 4D modeling, rendering, and animation app. This revision of Cinebench measures performance in a number of ways, including 3D rendering, software shading, and OpenGL shading with and without hardware acceleration. Cinema 4D’s renderer is multithreaded, so it takes advantage of Hyper-Threading, as you can see in the results.
The P4 XE 3.46GHz just edges out its predecessor in Cinebench, taking top honors.
The shading tests are a little rougher going for the Extreme Edition. The Prescotts do best in Cinema 4D shading, and Athlon 64s capture the top three spots in the OpenGL shading benchmarks.
We have used 3ds max in the past for CPU testing, but most of those tests have consisted of rendering only. WorldBench’s 3ds max tests replicate an entire modeling and animation work session, stressing the graphics card as well as the CPU and the rest of the system.
The XE 3.46GHz is the class of the Pentium 4 processors, but the AMDs again do best here.
POV-Ray is the granddaddy of PC ray-tracing renderers, and it’s not multithreaded in the least, because it’s designed to be a cross-platform application. POV-Ray also relies more heavily on x87 FPU instructions to do its work, and it contains only minor SIMD optimizations.
As expected, without many SSE, SSE2, or multithreading optimizations, the P4s don’t fare all that well.
Our power consumption results for the Athlon 64 processors were marred by the fact that the CPU voltage option on our Asus A8V Deluxe motherboard wasn’t actually functional. We could set the proper voltage for each CPU in the BIOS, but no matter what, both the AsusProbe monitoring software and CPU-Z reported 1.6V for the CPU. It’s possible this problem was caused by the revision 1008 beta 1 BIOS we were using, but that BIOS was necessary for compatibility with the 90nm Athlon 64. This problem was exacerbated by the fact that the A8V Deluxe is the only motherboard we had on hand that would POST with a 90nm Athlon 64 processor. As a result, all of the readings you’ll see below were taken with the CPU voltage at 1.6V (although we did set the proper value in the BIOS for each processor). Generally, 130nm Athlon 64s are supposed to run at 1.5V, and the 90nm flavors expect 1.4V. Take the results as you will, or ignore them if they offend your precise sensibilities.
The Pentium 4 is more complicated, because voltage specs for Prescott processors are set at the factory and may vary from one CPU to the next. The general expectation is about 1.4V, and we used the closest manual setting on our Abit AA8 DuraMax mobo, which was 1.3875V. The P4 Extreme Edition 3.4GHz ran at 1.575V, and we let the Intel mobo pick the default voltage for the P4 XE 3.46GHz.
We measured the power consumption of our entire test systems, except for the monitor, at the wall outlet using a Watts Up PRO watt meter. The test rigs were all equipped with OCZ PowerStream 470W power supply units. Ambient temps in Damage Labs were down in the low 70s, close to a sane room temperature. The idle results were measured at the Windows desktop, and we used Cinebench’s rendering test to load up the CPUs. For P4s, we used the multithreaded version of Cinebench to take advantage of Hyper-Threading.
The P4 XE 3.46GHz benefits from its P4 Northwood heritage; its power consumption is lower than that of the Prescott chips, although not by much.
The overclocking function on the Intel mobo we used for testing was fairly limited. We couldn’t adjust CPU voltage, and it would only overclock the CPU by a maximum of 10%. I was able to coax another 180MHz out of the P4 XE, bringing it up to 3.64GHz. Unfortunately, I had to back way off on the memory timings, from 3-3-3-8 to 4-4-4-12, in order to get the system stable.
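To put that overclock in perspective, here is a quick sketch of the arithmetic using the nominal clock speeds above. The implied bus clock assumes a fixed 13x multiplier, which is my inference from the rated speed and not something the board reported.

```python
# Overclocking headroom, using the nominal figures quoted above.
stock_mhz = 3460
overclocked_mhz = 3640
max_overclock_pct = 10     # the board's stated ceiling
ASSUMED_MULTIPLIER = 13    # assumption, inferred from the rated clock

gain_mhz = overclocked_mhz - stock_mhz
gain_pct = gain_mhz / stock_mhz * 100
board_ceiling_mhz = round(stock_mhz * (1 + max_overclock_pct / 100))
implied_base_clock = overclocked_mhz / ASSUMED_MULTIPLIER

print(f"Gain: {gain_mhz}MHz ({gain_pct:.1f}%)")
print(f"Board ceiling at 10%: {board_ceiling_mhz}MHz")
print(f"Implied base bus clock: {implied_base_clock:.0f}MHz")
```

In other words, the chip gave up at roughly half of what the board would have allowed.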
No dramatic gains here. However, a motherboard with better overclocking options might yield better results.
The Pentium 4 Extreme Edition 3.46GHz will list for $999. Our range of benchmarks shows that it’s a marginally better performer than its predecessor, the 3.4GHz XE with an 800MHz bus. In many cases, the fastest Pentium 4 of all is the P4 560 3.6GHz, which sells for well under half the price of the Extreme Edition 3.46GHz. And the undisputed overall performance king of the x86 CPU world is the Athlon 64 FX-55, which sells for an also ridiculous but comparatively cheaper $827.
In short, I don’t recommend buying this CPU. I’m pleased to see Intel going to extraordinary lengths to push the boundaries of performance at the high end of its desktop CPU lineup. That is a very cool, very welcome development. Still, this processor isn’t a good deal by any stretch of the imagination, and it doesn’t really live up to its billing as the most “extreme performance” desktop CPU, especially for gaming. There’s really no way around that.
I’m also disappointed that the primary benefit of the P4 XE 3.46GHz, the move to a 1066MHz front-side bus, won’t trickle down to the rest of the Pentium 4 product line until some time next year. Usually, a new high-end CPU offers some benefit for the rest of the product line, either in terms of price or features. That’s not really the case here.
The 925XE chipset, on the other hand, could be a very interesting product. The chipset itself will list for $50, making it competitive with the 925X chipset. That’s reflected in the fact that the Intel 925XE motherboard will cost the same as the 925X board it replaces. I’m not sure why anyone would buy a 925X-based board once the 925XE arrives, unless Intel decides to limit the number of XE chipsets that it ships.
The real action with the 925XE chipset will happen if we can pair it up with a lower-end Pentium 4 Prescott chip like the P4 520 2.8GHz and overclock the stuffing out of the thing by cranking the bus up to 1066MHz. If such overclocking feats offer a reasonably decent success rate, the 925XE may become a very hot commodity in the right circles.
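Just how ambitious would that be? Here is a rough sketch of the math, assuming the P4 520's multiplier is locked at whatever value yields 2.8GHz on an 800MHz (200MHz quad-pumped) bus. Whether any given chip would actually survive the trip is another matter entirely.

```python
# What a P4 520 would run at if the bus alone were raised to 1066MHz.
# Assumption: the multiplier stays fixed at its stock value.
QUAD_PUMP = 4

stock_clock_mhz = 2800
stock_base_mhz = 800 / QUAD_PUMP            # 200MHz
target_base_mhz = (3200 / 3) / QUAD_PUMP    # 266.67MHz

multiplier = stock_clock_mhz / stock_base_mhz
target_clock_mhz = multiplier * target_base_mhz
overclock_pct = (target_clock_mhz - stock_clock_mhz) / stock_clock_mhz * 100

print(f"Multiplier: {multiplier:.0f}x")
print(f"Clock on a 1066MHz bus: {target_clock_mhz:.0f}MHz")
print(f"Overclock: {overclock_pct:.0f}%")
```

That works out to a jump of about a third over stock speed, which is why the success rate matters so much.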