The result? A tie.
It’s been like this for months now, and I’m running out of ways to say, “Yeah, they’re pretty much tied, except sometimes one is much faster than the other, depending on what you want to do.” Last time out, we had more of the same at 2.4GHz versus 2100+.
Today, however, is different. While AMD struggles to deliver its first 0.13-micron version of the Athlon to the world, Intel is unleashing a new breed of Pentium 4 chips along with a new chipset to support them. These new processors will talk to the new chipset over a 533MHz front-side bus, which is, as we say in the industry, “really frickin’ fast.”
Also frickin’ fast is the newest high-end Pentium 4, which clocks in at 2.53GHz. So has Intel finally found the magic formula for defeating AMD’s finest? We’re about to find out.
The new chips
Intel is introducing three new Pentium 4 chips capable of running on a 533MHz bus. They’re clocked at 2.26GHz, 2.4GHz, and 2.53GHz. Because there’s already a 2.4GHz version of the Pentium 4 out there that runs on a 400MHz bus, the new chip at that speed will be designated “2.40B”. There’s really nothing new about these processors except their clock multipliers. They are Pentium 4 Northwood chips, and not much else is changed.
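To make the clock multiplier point concrete, here is a rough sketch of how core clock falls out of the base bus clock times the multiplier. The multiplier values (17, 18, 19, and 24 for the older 2.4GHz chip) are our own inference from the published clock speeds, not figures quoted from Intel, and the function name is purely illustrative.

```python
# Core clock = base bus clock x multiplier.
# The multipliers below are inferred from the rated speeds; treat them as assumptions.
BASE_CLOCK_533_MHZ = 400 / 3   # ~133.33MHz base clock, quad-pumped to "533MHz"
BASE_CLOCK_400_MHZ = 100.0     # 100MHz base clock, quad-pumped to "400MHz"


def core_clock_ghz(base_clock_mhz, multiplier):
    """Return the core clock in GHz for a given base clock and multiplier."""
    return base_clock_mhz * multiplier / 1000


# New 533MHz-bus parts: roughly 2.27, 2.40, and 2.53GHz
# (the slowest one is sold as 2.26GHz).
for multiplier in (17, 18, 19):
    print(multiplier, round(core_clock_ghz(BASE_CLOCK_533_MHZ, multiplier), 3))

# The older 2.4GHz part gets to the same core clock on a 100MHz base clock.
print(24, round(core_clock_ghz(BASE_CLOCK_400_MHZ, 24), 3))
```

If those assumed multipliers are right, the 2.40B and the original 2.4GHz chip run their cores at the same speed and differ only in how fast they talk to the chipset.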
The new chipsets
Similarly, the new 850E chipset is very much like its predecessor, the 850 chipset. The big difference is the “E” on the end of the name, which denotes “enhanced” or “extra bus speed” or maybe “extra nifty,” because the 850E supports a 533MHz front-side bus.
Moving from 400MHz to 533MHz increases peak bus bandwidth from 3.2GB/s to 4.2GB/s. To keep the processor fed on the other side of that bus, the 850E supports dual channels of PC800 RDRAM, which are good for up to 3.2GB/s. Although Intel claims the 850E isn’t yet officially validated for it, the 850E can also support two channels of PC1066 RDRAM, which should peak out at 4.2GB/s. Unfortunately, we weren’t able to obtain any PC1066 memory for testing just yet.
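The peak bandwidth figures above are simple arithmetic: transfers per second times bytes per transfer times the number of channels. Here is a small sketch of that math. The bus widths (8 bytes for the Pentium 4 front-side bus, 2 bytes per RDRAM channel) are assumptions on our part that the text doesn't spell out, and the function name is made up for illustration.

```python
def peak_bandwidth_gbs(mega_transfers_per_s, bytes_per_transfer, channels=1):
    """Theoretical peak bandwidth in GB/s (1GB = 1000MB here)."""
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000


# Front-side bus: assumed 64 bits (8 bytes) wide.
fsb_400 = peak_bandwidth_gbs(400, 8)          # about 3.2GB/s
fsb_533 = peak_bandwidth_gbs(1600 / 3, 8)     # about 4.27GB/s, quoted as 4.2GB/s

# RDRAM: assumed 16 bits (2 bytes) per channel, two channels.
pc800_dual = peak_bandwidth_gbs(800, 2, channels=2)        # about 3.2GB/s
pc1066_dual = peak_bandwidth_gbs(3200 / 3, 2, channels=2)  # about 4.27GB/s

for name, value in [("FSB 400", fsb_400), ("FSB 533", fsb_533),
                    ("2x PC800", pc800_dual), ("2x PC1066", pc1066_dual)]:
    print(f"{name}: {value:.2f} GB/s")
```

Under those assumptions, dual-channel PC800 matches the old 400MHz bus, and it takes dual-channel PC1066 to match the new 533MHz bus.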
The 850E MCH chip is, like the 850 and the 845, paired with Intel’s ICH2 system I/O chip (generally known in these parts as a south bridge chip). It offers support for ATA-100 disk drives and most of the other usual suspects.
According to this report at DigiTimes, Intel has a couple more chipsets up its sleeve in the coming weeks, including the 845E, 845G, and 845GL. The 845E is, you guessed it, a version of the 845 chipset with support for a 533MHz bus. The 845G is basically an 845E with integrated graphics, while the 845GL is the same thing with no external AGP slot. Of course, Intel doesn’t tend to comment on unreleased products, so we’ll have to wait and see whether DigiTimes is 100% accurate. Since both 845E and 845G boards are starting to show up at online vendors, I expect we’ll know soon.
Also showing up at online vendors are the new Pentium 4-based Celerons, which are reportedly Willamette chips with 128K of L2 cache. With the move to a 533MHz bus for the Pentium 4, Intel now has room in its lineup for this racy new Celeron. These Celerons will no doubt be paired up with the 845GL chipset and sold by the boatload in corporate desktop systems.
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.
Our test systems were configured like so:
| ||Athlon XP||Pentium 4 845||Pentium 4 850||Pentium 4 850E||Pentium 4|
|Processor||AMD Athlon XP 2100+ 1.73GHz||Intel Pentium 4 2.4GHz||Intel Pentium 4 2.4GHz||Intel Pentium 4 2.4GHz / Intel Pentium 4 2.53GHz||Intel Pentium 4 2.4GHz / Intel Pentium 4 2.53GHz|
|Front-side bus||266MHz (133MHz double-pumped)||400MHz (100MHz quad-pumped)||400MHz (100MHz quad-pumped)||533MHz (133MHz quad-pumped)||533MHz (133MHz quad-pumped)|
|Motherboard||Shuttle AK35GT2/R||Abit BD7-RAID||Intel D850MD||Intel D850EMV2||Abit SD7-533|
|Chipset||VIA KT333||Intel 845||Intel 850||Intel 850E||SiS 645|
|North bridge||VT8367||82845 MCH||82850 MCH||82850E MCH||SiS 645|
|South bridge||VT8233A||82801BA ICH2||82801BA ICH2||82801BA ICH2||SiS 961|
|Chipset drivers||VIA 4-in-1||Intel Application Accelerator 6.22||Intel Application Accelerator 6.22||Intel Application Accelerator 6.22||N/A|
|Memory size||512MB (2 DIMMs)||512MB (2 DIMMs)||512MB (4 RIMMs)||512MB (4 RIMMs)||512MB (2 DIMMs)|
|Memory type||Micron PC2700 DDR SDRAM||Micron PC2100 DDR SDRAM||Samsung PC800 Rambus DRAM||Samsung PC800 Rambus DRAM||Micron PC2700 DDR SDRAM|
|Graphics||NVIDIA GeForce4 Ti 4600 128MB (Detonator XP 28.32 video drivers)|
|Sound||Creative SoundBlaster Live!|
|Storage||Maxtor DiamondMax Plus D740X 7200RPM ATA/100 hard drive|
|OS||Microsoft Windows XP Professional|
Note that we’ve included an SiS 645-based system so we can get a sense for the performance of the P4s with a 533MHz bus and DDR333 memory. Although the SiS 645 chipset doesn’t officially support a 533MHz bus, Abit’s wondrous (and aptly named) little SD7-533 motherboard offers all the right divisors for PCI, AGP, and memory to allow operation with a 533MHz front-side bus without running anything else out of spec. In fact, the board performs flawlessly with the bus at 533MHz. SiS has recently released its 645DX chipset with official 533MHz bus support and a slightly improved memory controller, and we’ll have one in-house for testing soon. Nevertheless, what you see from the SD7-533 shouldn’t be far off what you can expect from other DDR333 solutions for these new Pentium 4 chips.
I should also note that we’re using the Intel Application Accelerator drivers instead of the older Ultra ATA drivers. We elected to go this route because Intel is replacing its Ultra ATA drivers with IAA. In addition to providing support for Ultra ATA modes, the Application Accelerator does some prefetching to improve I/O throughput, so products based on Intel chipsets may have a slight advantage as a result. But then, that’s the point. We’re hopeful other chipset manufacturers will incorporate similar performance-boosting measures in their drivers as well, if they haven’t already.
The test systems’ Windows desktops were set at 1024×768 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.
We used the following versions of our test applications:
- SiSoft Sandra Standard 2002
- Compiled binary of C Linpack port from Ace’s Hardware
- ZD Media Business Winstone 2001 1.0.2
- ZD Media Content Creation Winstone 2002 1.0
- POV-Ray for Windows version 3.5 beta RC3
- NewTek Lightwave 7.0b
- Sphinx 3.3
- ScienceMark 1.0
- LAME 3.91
- Xmpeg 4.5 with DivX Video 5.01
- MadOnion 3DMark 2001 SE
- Codecreatures Benchmark Pro
- Comanche 4 demo benchmark
- Serious Sam SE v1.05
All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
Since the big change we’re examining today is a faster front-side bus, memory performance is key. SiSoft Sandra’s synthetic memory bandwidth tests will give us a peek at how effective the higher bus speeds are in delivering more throughput.
Sure enough, the 533MHz bus delivers a good chunk more bandwidth than the “old” 400MHz bus, especially with dual-channel RDRAM. Although the 400MHz bus theoretically was fast enough to accommodate 3.2GB/s, in practice, it was a bottleneck.
Speaking of bottlenecks, the Athlon XP 2100+ delivers almost no more memory bandwidth here, with DDR333 memory on a KT333 chipset, than it did last time out, when we used DDR266 on a KT266A. The limiting factor is clear: The Athlon XP’s 266MHz bus. The Athlon XP’s bus is now effectively half the speed of Pentium 4’s new 533MHz bus (although AMD’s bus is probably a little more efficient). Unless and until AMD raises the speed of the Athlon XP’s front-side bus, the Athlon XP will be at a disadvantage in scenarios where memory or bus bandwidth is critical.
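The bottleneck argument boils down to a "slowest link wins" calculation: the CPU can't pull data from memory any faster than its front-side bus allows. The sketch below illustrates that with theoretical peaks only. It assumes 64-bit (8-byte) paths for both the Athlon XP bus and a single DDR channel, and the helper names are hypothetical.

```python
def peak_gbs(mega_transfers_per_s, bus_bytes=8):
    """Theoretical peak in GB/s for an assumed 64-bit (8-byte) path."""
    return mega_transfers_per_s * bus_bytes / 1000


def deliverable_gbs(fsb_gbs, memory_gbs):
    """The CPU sees, at best, the slower of the bus and the memory."""
    return min(fsb_gbs, memory_gbs)


athlon_fsb = peak_gbs(266)   # about 2.1GB/s
ddr266 = peak_gbs(266)       # PC2100, about 2.1GB/s
ddr333 = peak_gbs(333)       # PC2700, about 2.7GB/s

with_ddr266 = deliverable_gbs(athlon_fsb, ddr266)
with_ddr333 = deliverable_gbs(athlon_fsb, ddr333)
print(f"DDR266: {with_ddr266:.2f} GB/s, DDR333: {with_ddr333:.2f} GB/s")
```

On paper, at least, faster memory buys the Athlon XP nothing in peak throughput until the bus itself speeds up, which lines up with what Sandra shows.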
We’ll use Linpack to illustrate memory performance with a little more precision. Check out the funky graph:
You can see here how the L1 and L2 caches of the processors help performance when handling small data matrices, but once we get past about 512K, main memory access is the name of the game. The Athlon XP is fastest when it’s crunching numbers stored in its 64K L1 cache. After that, though, it’s all Pentium 4. The P4’s big, fast L2 cache delivers more peak performance than the Athlon XP, and the P4 never relinquishes the lead.
Now we get to see what happens when the rubber meets the road. Does the 533MHz bus really help in pedestrian tasks like running everyday business applications?
The 850E chipset brings a slight but measurable performance increase over the 850 and 845, even in Business Winstone. The SiS 645 turns in an outright lousy performance here, but low performance in Business Winstone is a quirk we’ve come to expect in the SiS 645; it doesn’t seem to rear its head in other tests.
As for the Athlon XP, it’s near the bottom of the pack.
Content Creation Winstone
We’re using the newest version of CC Winstone, version 2002, but not without some reservations, which we expressed here. Ideally, we’d have time to run both the 2001 and 2002 versions of CC Winstone to provide a little perspective. Ideally, we’d be retired in the Cayman Islands right now, too, but it ain’t happening.
The Pentium 4 systems are bunched up pretty tightly here. The systems with 533MHz FSBs take the top four spots, but only by a hair. The Athlon XP, meanwhile, just can’t keep up.
POV-Ray 3D rendering
This time around, we decided to try out a beta version (RC3) of the new POV-Ray 3.5. For continuity’s sake, however, we’re rendering the same test scene we’ve used with POV-Ray 3.1 for some time now.
Even with an 800MHz clock speed advantage, the Pentium 4 can’t keep pace with the Athlon XP. The Athlon XP’s underlying computational abilities are formidable, and 3D rendering of this sort doesn’t really tend to stress things like bus bandwidth.
Interestingly enough, on both processor architectures, the 3.5 beta version of POV-Ray is much faster than version 3.1, but it’s slower than the recompiled version of 3.1 we used in previous tests.
Lightwave 3D rendering
To keep things interesting, we’ve added NewTek’s Lightwave to our test suite. Lightwave is used by Hollywood animation studios and the like, and it offers a unique chance to test 3D rendering speed in an application that’s optimized for SSE2. In fact, NewTek released Lightwave version 7.0b concurrently for PCs and Macs, offering SSE2 optimizations on the Pentium 4 and AltiVec optimizations on the G4.
With code specifically optimized for the Pentium 4, Lightwave runs much faster on the P4 than on the Athlon XP. Note that vectorizing code to take advantage of SIMD instruction sets isn’t as simple as using a new compiler; there’s a reason NewTek released SSE2- and AltiVec-enhanced versions of Lightwave at the same time. Still, once the optimizations are in place, the performance gains are notable.
Lightwave is the only case I’ve seen where the Pentium 4 could beat the Athlon XP in 3D rendering performance. The SSE2 optimizations help, no doubt, but the P4’s margin of victory is just gaudy. And I’m not clear on some of the particulars here. Why didn’t NewTek use 3DNow! or SSE if it’s available, instead of supporting only SSE2? (I’m sure hordes of knowing readers are preparing an e-mail deluge for me on that question already. Pre-emptively, let me say: I’ve not been able to find documentation of AltiVec’s ability to handle double-precision floating-point calculations like SSE2, yet they used AltiVec.)
Also, some folks have reported slowdowns with Athlon XPs when moving from 7.0 to 7.0b code. Update: The test results in this article, which seem to have been the foundation for those claims, show only minor slowdowns with the Athlon XP on a few tests and faster render times on others. It is true that the Athlon XP turned in slower times here with 7.0b than with 7.0 in both of the benchmarks that we used above (raytrace and reflective radiosity). However, the actual time increases were very small, only a few seconds. Nothing to fret over.
Finally, just as we finished our testing, we found out NewTek had recently released version 7.5 of Lightwave. We’ll try the newer release next time around and see if that has any effect on relative performance.
LAME MP3 encoding
Our previous LAME test setup was simply being run over by high-speed CPUs; they were crunching through an entire 50MB audio file in about 20 seconds, with only fractions of a second separating the fastest times. So this time around, we’ve beefed things up by using a 101MB source audio file and asking LAME to encode a high-quality variable bit rate MP3. The exact command-line options we used were:
lame -v -b 128 -q 1 file.wav file.mp3
This encoding task produced the following results:
For high-quality VBR encoding, the Pentium 4 is fastest. MP3 encoding with LAME doesn’t rely heavily on bus or memory bandwidth; there’s very little difference between a 400MHz and 533MHz FSB or between different memory types. Any way you cut it, though, the Pentium 4 leads here, and the 2.53GHz version is fastest of all.
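For anyone who wants to reproduce a timing like this at home, here is a minimal sketch of the "run it at least twice and average" approach from our testing methods. It is not the harness we actually used; the function name and the stand-in command are placeholders, and you would swap in the LAME command line shown above.

```python
import statistics
import subprocess
import sys
import time


def average_runtime(command, runs=2):
    """Run `command` `runs` times and return the mean wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(command, check=True)  # raises if the command fails
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


# Placeholder command that does nothing; replace it with something like
# ["lame", "-v", "-b", "128", "-q", "1", "file.wav", "file.mp3"].
stand_in_command = [sys.executable, "-c", "pass"]
```

Calling `average_runtime(stand_in_command)` returns the mean of two runs. Wall-clock timing like this includes process startup and disk I/O, so it is only a rough proxy for raw encoding speed.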
DivX video encoding
We’ve finally decided to complement our audio encoding tests with a video encoding test. Xmpeg can encode video files using the popular DivX format, which produces very high quality video in relatively small amounts of space. For this test, we took a 279MB video file, encoded in MPEG2 format at DVD quality, and converted it to a 37MB DivX file.
Xmpeg supports all the various x86 SIMD instruction sets, including MMX, 3DNow!, SSE, SSE2, and even various flavors of 3DNow!, like 3DNow! Enhanced. Most importantly, perhaps, Xmpeg makes good use of the Pentium 4’s SSE2 instruction set, which offers potentially higher performance than the SSE or 3DNow! instructions supported by the Athlon XP.
With the aid of SSE2 and gobs of memory bandwidth, the Pentium 4 wins this one handily. If you plan on doing video editing work, by all means, consider a Pentium 4 system.
Codecreatures Benchmark Pro
Here’s another new addition to our test suite. The Codecreatures benchmark builds and renders scenes with an absolutely insane number of polygons in real time. It’s one of the best looking things we’ve seen in real-time graphics on a PC.
AMD’s supposedly outmatched processor pulls a win out this time around. Obviously, Codecreatures leans pretty heavily on the video card, because the performance differences here are minuscule. Still, Codecreatures is quite likely a good indicator of graphics performance in future games, and as such, it bodes well for the Athlon XP.
3DMark 2001 SE
3DMark scores scale up very gradually as processor speeds and bus speeds increase. Once again, the faster bus speeds provide a slight but real performance advantage.
Serious Sam SE
Serious Sam has always favored Athlons over Pentium 4s, until now. The 2.53GHz systems take the lead.
Comanche 4 is a true DirectX 8 game that makes use of our graphics card’s pixel and vertex shaders. It’s also one heck of a punishing load on a CPU.
As in most of the other tests, the combination of the 850E chipset and the Pentium 4 2.53GHz can’t be beat. Notably, though, there’s very little difference in performance between the Pentium 4 2.4GHz on the 850E chipset and the same speed P4, with a slower bus, on the 850 chipset.
The Sphinx speech recognition software has been fun for us to test over the past months because we’ve been dancing around on the border between real-time and less-than-real-time speech recognition. We’ve been waiting for the P4 with a 533MHz bus to see if we couldn’t decidedly shatter that barrier once and for all. And the verdict?
Not a problem! All four of our P4 systems with the faster bus speed easily process speech at rates faster than real time, regardless of which compiler was used to generate the Sphinx executable.
The next big barrier is 0.8 times real time, at which point Sphinx will execute fast enough to allow real-time, high-quality speech recognition applications to work really well.
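For the curious, the "times real time" figure is just processing time divided by the length of the audio being recognized, so anything under 1.0 means the software keeps up with live speech. The sketch below shows that ratio. The function names and the sample numbers are made up for illustration and are not Sphinx results.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time as a multiple of audio length; lower is faster."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def is_faster_than_real_time(factor):
    """True when recognition finishes before the audio would finish playing."""
    return factor < 1.0


# Hypothetical example: 45 seconds of work on a 60-second utterance.
example_factor = real_time_factor(45, 60)
print(example_factor, is_faster_than_real_time(example_factor))
```

By that yardstick, the hypothetical run above would already clear the 0.8 barrier mentioned here.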
We’ll finish up our testing with ScienceMark, which measures performance in several real-world scientific computing scenarios.
ScienceMark has always been the domain of the Athlon, and even now, the AMD processor leads the field. Some of ScienceMark’s individual tests tell an intriguing story here.
The Athlon XP is far and away fastest on the Liquid Argon routine, over 12 seconds faster than the nearest Pentium 4.
Primordia, however, likes fast memory subsystems. The Pentium 4 dominates here.
QMC is again faster on the Athlon XP.
There’s no reason to mince words about this one. Intel has clearly and unambiguously taken the performance lead away from AMD now. The faster front-side bus speeds are helpful, as is the clock speed boost to 2.53GHz. Combined, they give the Pentium 4 enough extra performance to beat out the Athlon XP 2100+ in all but a few of our tests.
For the 850E chipset and its RDRAM memory, the future is a little murky. Even this move to a very fast bus hasn’t made DDR memory much less competitive, as RDRAM advocates have claimed would happen over time. Our overclocked SiS 645 motherboard put in a very respectable showing against the 850E. PC1066 RDRAM, when it becomes available, will give the 850E a boost. But as Intel, SiS, VIA, and others prepare official DDR333 and even dual-channel DDR chipsets for the 533MHz-bus Pentium 4 chips, the 850E will have a real fight on its hands.
What’s ominous for AMD is the way Intel has positioned itself with the Pentium 4. Right now, the move from a 400MHz to 533MHz bus looks kind of underwhelming in most of our tests. But the faster bus speed gives the Pentium 4 loads of headroom for the future; all Intel has to do is turn up the clock now. Given the overclocking successes we’ve seen with Pentium 4 Northwood chips, there’s no reason to believe Intel can’t be at 3GHz by fall. Heck, they could probably get there sooner if they really pushed.
But there’s no reason to push, because it doesn’t seem likely to me that clock speed increases alone, in the form of the forthcoming Thoroughbred chip, will deliver the performance crown back into the hands of AMD. AMD will have to come up with something, whether a larger cache, a faster bus, or architectural improvements, in order to keep pace. It might well be early next year, when AMD’s Hammer chip finally bows, before AMD challenges Intel for the outright x86 performance lead once again.
Then again, maybe none of that matters to AMD. From a price-performance standpoint, the Athlon is still far and away the leader. Intel’s pricing for the new Pentium 4 chips looks like so:
Pentium 4 2.53GHz: $637
Pentium 4 2.40B: $562
Pentium 4 2.26GHz: $423
For $637, you can put together the core of a nice Athlon XP-based system with a motherboard, memory, and a graphics card. (We’re talking Athlon XP 2100+ processor, KT266A motherboard, 512MB PC2100 memory, and a GeForce4 Ti 4200 graphics card for that price.) Yeowch!
At the end of the day, having the very fastest comes with a steep price. And the Pentium 4 2.53GHz on a 533MHz bus is definitely the very fastest x86 processor you can buy.