After some investigation, Intel pulled the 1.13GHz Pentium III back from the market, and a long hiatus ensued. The Pentium III languished at gigahertz and lower clock speeds while the company concentrated on its new, high-clock-speed burner, the Pentium 4. Equipped with a brand-new NetBurst microarchitecture, oodles of platform bandwidth, and the power of a whole new digit after its name, the P4 was Intel’s best tool to combat AMD’s advancing Athlon.
The P4 was designed to run at extremely high clock speeds, and it’s already reached 1.8GHz when built on the same fabrication process as the original, stillborn 1.13GHz Pentium III. Meanwhile, AMD’s assault has continued. The Athlon has added faster DDR memory and a 266MHz bus, and AMD has pushed it to 1.4GHz. At that speed, it was the highest performance desktop PC processor available last we checked. Even the mighty Pentium 4, with all those megahertz available to it, couldn’t best the Athlon.
All of that may turn around soon, however. The Athlon is a great design, but one could argue the processor industry is primarily about manufacturing. On the manufacturing front, Intel has just taken a decisive lead, and the processor we’re reviewing here today is the first evidence of that fact.
Things really do shrink with age
Intel has managed to push the Pentium III design past the 1.13GHz barrier by manufacturing the chip on its brand-new fabrication process. Previous “Coppermine” PIIIs were built on Intel’s 0.18-micron fab process. (The “micron” designation refers to the width of features on the chip. If 0.18 microns sounds extremely small, you’re getting the idea. This is how they pack millions of transistors into a thumbnail-sized area.) This new Pentium III, code-named Tualatin during its development, is built on a brand-new, 0.13-micron fab process. This new process is smaller, but that’s not all. Intel is now using copper interconnects instead of aluminum, and they’re using low-capacitance dielectrics, as well. (Despite the name, Coppermine chips used aluminum interconnects.)
All of these things will allow chips to run more efficiently, cooler, with less power, and at higher clock rates. The Pentium III 1.2GHz requires only 1.475V core voltage, and it’s a model of stability. We didn’t experience a single crash during our testing.
Beyond the die shrink, the new Pentium III isn’t radically changed from its previous renditions. The desktop version has the same L2 cache configuration as the Coppermine, but the mobile and server versions have twice the cache (512K) onboard. Beyond that, the only big change that might affect performance is new data prefetch logic. Like the P4 and AMD’s new “Palomino” Athlons, the new PIII will try to anticipate what data a program will need next and pre-load it into the chip’s L2 cache. In certain situations, especially on the chips with 512K L2 caches, data prefetch could improve performance. Unfortunately, the PIII is still saddled with a 133MHz front-side bus and PC133 memory, so don’t expect huge gains.
Cosmetically, the new PIIIs look a little different than previous PIIIs because they’re sporting an Integrated Heat Spreader (IHS) a la the Pentium 4. For the terminology-weary among us, IHS is just a fancy name for “metal cap.” The IHS covers the chip and, presumably, spreads heat.
There is a hole in the IHS, you may have noticed. An Intel rep once told me that hole is there to allow gas to escape as the newly-made chip cures. Either that’s true, or he’s been chuckling at my sadly mistaken notion of a farting chip. Whatever the case, the hole doesn’t present any problems; if you get a little thermal paste down there, it’s no big deal.
The new PIIIs fit into good ol’ Socket 370 sockets, like so many PIIIs before them. However, because of clocking, voltage, and signal level differences, Intel doesn’t recommend using these chips in older motherboards. Our test system included a B-step revision of Intel’s 815 chipset, which is able to support the new PIIIs properly. Intel bills the motherboard, the catchily-named D815EEA2, as a universal Socket 370 platform, because it’s able to accept anything from the oldest socketed Celerons to the newest Tualatin PIIIs.
This new Pentium III faces some formidable competition, especially on the desktop, where it’s got to square off with a couple of newer, sexier designs: the Athlon and Pentium 4. In order to test the new PIII’s mettle, we threw it into the ring with the latest and greatest, including a 1.4GHz Athlon and a 1.8GHz Pentium 4. To be fair, we also tested against a 1.2GHz Athlon and against the P4 at 1.4 and 1.6GHz.
Before you count the PIII out, however, keep a couple of things in mind. First and foremost, remember this:
Clock speed ain’t everything.
If it were, the Pentium 4 1.8GHz would be the undisputed winner. Everybody else would just have to pack up and go home. But the Pentium 4’s deeply pipelined design can’t deliver the same performance, clock for clock, that the Athlon or Pentium III can. In fact, in our previous tests, the 1.7GHz Pentium 4 ran neck-and-neck with the 1.2GHz Athlon. Now that doesn’t make the Pentium 4 a bad design. It’s just that the P4 executes fewer instructions per clock (IPC) than the other chips here. The Athlon and PIII, by contrast, are pretty evenly matched clock for clock.
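The clock-for-clock argument is easy to put into numbers. Here is a minimal sketch, assuming a deliberately crude model in which performance is simply IPC times clock speed; the function name is ours, purely for illustration, and real workloads won't follow a single ratio this neatly.

```python
def relative_ipc(clock_a_ghz, clock_b_ghz):
    """IPC of chip A relative to chip B, assuming the two chips deliver
    equal performance and that performance = IPC * clock (a simplification)."""
    return clock_b_ghz / clock_a_ghz


# A 1.7GHz Pentium 4 running neck-and-neck with a 1.2GHz Athlon
p4_vs_athlon = relative_ipc(1.7, 1.2)
print(f"P4 IPC relative to Athlon: {p4_vs_athlon:.2f}")  # prints 0.71
```

Under that assumption, the P4 would be getting roughly 70 percent as much work done per clock tick as the Athlon in those earlier tests.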
On top of all that, the PIII 1.2GHz is the only chip here made on a 0.13-micron process. The other two are both 0.18-micron chips. The new PIII is smaller, runs cooler, and requires less power.
However, the newer processor designs here do have compelling advantages. To better understand the differences between these different CPUs and their platforms, I suggest you read our reviews of the Pentium III Coppermine, AMD 760 chipset with DDR SDRAM, and the Pentium 4. These newer processors aren’t just more advanced designs; they sit in more advanced platforms. They ride on much faster front-side busses, from 266MHz for the Athlon to 400MHz for the P4. And they use faster memory: 2.1GB/s DDR SDRAM for the Athlon, and exotic 3.2GB/s RDRAM for the Pentium 4. The Pentium III’s pokey 133MHz bus and 1.06GB/s memory seem almost quaint anymore.
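For the curious, those bandwidth figures fall out of simple arithmetic: transfers per second times bytes per transfer. The sketch below assumes the standard widths for these memory types (64 bits for SDRAM and DDR SDRAM, 16 bits per RDRAM channel), which are our assumption and not figures stated in this review, and it uses nominal rather than exact clock rates.

```python
def peak_bandwidth_mb_s(transfers_mhz, width_bits, channels=1):
    """Theoretical peak in MB/s: million transfers/sec * bytes per transfer * channels."""
    return transfers_mhz * (width_bits // 8) * channels


# Nominal transfer rates; bus widths are assumed standard values
platforms = {
    "PIII, PC133 SDRAM": peak_bandwidth_mb_s(133, 64),
    "Athlon, PC2100 DDR SDRAM": peak_bandwidth_mb_s(266, 64),
    "P4, dual-channel PC800 RDRAM": peak_bandwidth_mb_s(800, 16, channels=2),
}

for name, mb_s in platforms.items():
    print(f"{name}: {mb_s / 1000:.2f} GB/s")
```

These are theoretical peaks only. Measured numbers will come in lower.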
So it won’t be easy for the new PIII. Still, the proof’s in the pudding, so let’s get on to the benchmarks.
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged. The Pentium III test system was built using the following components:
Processor: Intel Pentium III processor at 1.2GHz on a 133MHz bus
Motherboard: Intel D815EEA2 motherboard – Intel 815 chipset – 82815 memory controller hub (MCH), 82801BA I/O controller hub (ICH2)
Memory: 256MB PC133 SDRAM (CAS 3) in two 128MB DIMMs
Video: NVIDIA GeForce 3 64MB AGP (Version 12.41 drivers)
Audio: Creative SoundBlaster Live!
Storage: IBM 75GXP 30.5GB 7200RPM ATA/100 hard drive
Our comparison systems varied only with respect to the motherboard, memory, and CPU. The Athlon system was built with these parts:
Processor: AMD Athlon processors – 1.2GHz and 1.4GHz on a 266MHz (DDR) bus
Motherboard: Gigabyte GA-7DX motherboard – AMD 761 north bridge, VIA VT82C686B south bridge
Memory: 256MB PC2100 DDR SDRAM (CAS 2.5) in one 256MB DIMM
The Pentium 4 box looked like this:
Processor: Intel Pentium 4 processors – 1.4GHz, 1.6GHz, and 1.8GHz on a 400MHz (quad-pumped) bus
Motherboard: Intel D850GB – Intel 850 chipset – 82850 memory controller hub (MCH), 82801BA I/O controller hub (ICH2)
Memory: 256MB PC800 DRDRAM memory in two 128MB RIMMs
All systems were equipped with Windows 2000 SP2 with DirectX 8.0a.
We used the following versions of our test applications:
- SiSoft Sandra Standard 2001.3.7.50
- Compiled binary of C Linpack port from Ace’s Hardware
- ZD Media Business Winstone 2001 1.0.1
- ZD Media Content Creation Winstone 2001 1.0.1
- LAME 3.70
- SPECviewperf 6.1.2
- POV-Ray for Windows version 3.1g (multiple compiles)
- 3DMark 2001 Build 200
- Quake III Arena 1.17
- Serious Sam v1.02
- ScienceMark 1.0
- Sphinx 3.3
The test systems’ Windows desktops were set at 1024×768 in 32-bit color at a 75Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.
All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
To highlight the biggest difference between the Pentium III and its younger competitors, we’ll start off with memory bandwidth tests. (Well, we pretty much always start off with those, but that’s my excuse this time ’round.) The modified Stream test in SiSoft’s Sandra does a nice job showing the peak bandwidth potential of each of these platforms. Here’s how they stack up.
As you can see, the Pentium III is at a distinct disadvantage here. It delivers about half of the memory bandwidth of the Athlon DDR, and about a fifth of what the Pentium 4’s dual Rambus channels manage to crank out. Now memory bandwidth isn’t everything. In fact, its impact on real-world performance isn’t huge: the Athlon gained between five and twenty percent with the move to DDR memory, and generally closer to five. Also, bandwidth is only one measure of memory performance. Memory latency is probably even more important to real-world performance (though the two things are related). Although latency is harder to measure than bandwidth, the PIII’s SDRAM probably has lower latencies than the memory in our other two contenders, especially the P4’s RDRAM.
Memory performance also interrelates with L1 and L2 cache performance. Let’s look at this another way, so we can pull all of those elements together. The snazzy-looking Linpack graph will do that neatly.
This graph’s not always easy to read. For simplicity’s sake, I’ve omitted the Pentium 4 1.6GHz from this one. For those of you who don’t know how to read a Linpack graph, I’ve shamelessly stolen my explanation from a previous review and put it in here:
Linpack performs floating-point operations on a range of data matrices, and the resulting line graph shows the strengths and weaknesses of each processor. The Athlon shows its floating-point prowess by offering the highest peak performance with a relatively small data set. But once we reach about 192K, the Pentium 4 has a pronounced lead. Its 256-bit data path to its L2 cache, combined with a very smart L2 cache controller, helps put the P4 on top. Note that the Pentium III, which also has a 256-bit L2 cache interface, has a similarly shaped curve, and peaks at about the same place as the P4.
Both Intel processors start to drop off sharply at about 256K, while the Athlon hangs on until it reaches about 320K. Here you can see the Athlon’s exclusive L2 and L1 caches working together. Because the Athlon’s L2 cache doesn’t replicate the contents of its 64K L1 data cache, its total effective cache size is larger than either of the Intel processors. (The Athlon also has a 64K L1 instruction cache.)
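The 320K figure isn't a coincidence. As a rough sketch, treat an exclusive cache hierarchy as adding L1 and L2 together, and treat the Intel chips' L2 as simply duplicating whatever is in L1. The 256K L2 and Intel L1 data cache sizes used here are the commonly cited figures for these chips, not numbers taken from the graph, and the function name is hypothetical.

```python
def effective_data_cache_kb(l1_data_kb, l2_kb, exclusive):
    """Unique data capacity in KB: exclusive caches add up,
    otherwise the L2 is assumed to duplicate the L1's contents."""
    return l1_data_kb + l2_kb if exclusive else l2_kb


athlon = effective_data_cache_kb(64, 256, exclusive=True)   # 64K L1 data + 256K L2
piii = effective_data_cache_kb(16, 256, exclusive=False)    # assumed 16K L1 data
p4 = effective_data_cache_kb(8, 256, exclusive=False)       # assumed 8K L1 data

print(f"Athlon: {athlon}K, PIII: {piii}K, P4: {p4}K")  # Athlon: 320K, PIII: 256K, P4: 256K
```

That lines up with where each curve falls off: about 256K for the Intel chips and about 320K for the Athlon.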
Once we get to those sharp, downward curves, we’re accessing main memory to perform the calculations. And once that happens, the Pentium 4’s fast front-side bus and dual RDRAM channels kick into high gear. The Pentium 4 delivers well over twice the sustained performance of the DDR SDRAM-based Athlon system with larger data sets, and it crushes the Pentium III, as well. A very impressive showing.
The Pentium III’s L2 cache is very fast. At about 192K, it peaks out faster than the L2 caches on both the 1.4GHz Pentium 4 and the 1.2GHz Athlon. However, once we’re out to main memory, we see the same pattern the Stream tests showed us above: the PIII is much slower going to main memory than its competitors.
I should also mention that our PIII test system was crippled by the fact that I’m cheap. (If you don’t believe me, check this out. I’m rather sad, really.) We used low-brow generic CAS 3 memory instead of widely available, and faster, CAS 2 RAM, because that’s what I had on hand. Some of the newer PC133 SDRAM DIMMs I’ve bought recently might have delivered better performance, but they were high-density chips, and the 815 chipset doesn’t take kindly to them. It’s also likely this Intel motherboard uses rather conservative timings, sacrificing some performance for stability. We wouldn’t be shocked to see memory bandwidth scores another 100-150MB/s faster for a properly tuned Pentium III system.
That said, memory bandwidth will have an uneven, and sometimes very minor, impact on overall performance.
Business Winstone 2001
Business Winstone tests common office-suite applications, and will give us our first real indication of how the new PIII performs.
Like I said, don’t count the PIII out. The 1.2GHz PIII whups up on the 1.2GHz Athlon, and absolutely embarrasses the Pentium 4, besting even the 1.8GHz chip.
Content Creation Winstone 2001
Content Creation Winstone is a little heavier on the kinds of apps that are more performance-sensitive than MS Word: image and audio editing software and the like.
Here, the PIII falls near the bottom of the pack, showing its age a little. It’s still a very respectable performance, all things considered, but not the coup that we saw on the Business test. The PIII’s relative weakness in more multimedia-intensive apps could be a hint of things to come.
POV-Ray 3D rendering
POV-Ray is a freeware software ray-tracing program that creates high-quality 3D scenes. It’s also a very useful measure of a processor’s performance, particularly on floating-point math. Our POV-Ray tests use the original release of POV-Ray 3.1, plus Steve Schmitt’s recompiled versions, just to see what difference the various compilers and compiler settings can make.
The recompiled POV-Ray comes in two flavors: “PIII” and “P4”. Both were produced with Intel C v. 5.0. The “PIII” version doesn’t use any instructions proprietary to Intel processors or to the PIII; it runs just fine on the Athlon and the P4. The “P4” version uses a small bit of SSE2 code, but it doesn’t take advantage of the P4’s SIMD capabilities. I’ve indicated which version of POV-Ray was used in the graphs below next to the processor/speed labels, so it should be easy to track.
Also, because the graphs were getting big enough already, I’ve again omitted results for the 1.6GHz Pentium 4.
With optimizations, the 1.4GHz Athlon is the overall leader, especially in the compute-intensive chess2 test. All of the processors benefit from optimizations, but the Pentium 4 most of all. Putting it another way, one could say the P4 performs relatively weakest with legacy code. The PIII 1.2GHz can’t get much higher than middle of the pack, even with optimizations. However, it splits the difference with the 1.8GHz Pentium 4 when using the original version of POV-Ray. In all, the PIII’s floating-point performance is still impressive, but the newer designs have it beat.
LAME MP3 encoding
LAME is the encoder of choice around Damage Labs for high-quality output, so this test holds some interest for me. More speed for MP3 encoding is always good. However, to keep it fair, we’ve avoided the newer builds of LAME that incorporate support for the Athlon’s 3DNow! instructions.
Here again, the PIII is a little weak in a multimedia-related task. Unlike some other multimedia-oriented tasks, in our experience, this test doesn’t benefit much at all from extra memory bandwidth; it’s almost entirely CPU-bound.
Quake III Arena
Now we enter the 3D gaming realm, which isn’t likely to be kind to the PIII. Memory and bus bandwidth are the orders of the day here, as are driver optimizations for SIMD extensions like SSE, 3DNow!, and SSE2. Can the PIII pull out a surprise win in Quake III?
Not a chance. It’s safe to say the older PIII and its platform are simply outclassed here. Quake III is notoriously bandwidth-hungry.
Let’s try another OpenGL-based first person shooter for good measure. Serious Sam allows us to plot performance over time, so we can see how the different processors handle different portions of the game demo we’re timing. In this case, we’ve used five-second intervals. The end result looks like so:
You’ll notice immediately how the lines for the PIII and Athlon are shaped similarly, while the P4 follows its own path. The bigger dips in the P4’s lines are almost to be expected from a CPU with such deep pipelines; the P4’s performance is rather peaky in character.
The 1.2GHz PIII performs a little better overall than the 1.4GHz P4, but it’s slower than everything else. The Athlons are definitely the winners here.
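If you want to build a performance-over-time plot like this one yourself, the basic idea is to bucket frames into fixed windows and average within each. This is only a sketch of that idea, assuming you have a list of frame completion timestamps to work from; it is not how Serious Sam itself records its data, and the function and variable names are made up for illustration.

```python
from collections import defaultdict


def fps_by_interval(frame_times, interval=5.0):
    """Average frames per second in each fixed-length window.

    frame_times: timestamps in seconds (from demo start) at which frames finished.
    Returns one value per window, in order. A partial final window is still
    divided by the full interval length, so it will read low.
    """
    counts = defaultdict(int)
    for t in frame_times:
        counts[int(t // interval)] += 1
    if not counts:
        return []
    return [counts[i] / interval for i in range(max(counts) + 1)]


# Made-up example: 100 fps for five seconds, then 50 fps for five seconds
frame_times = [i / 100 for i in range(500)] + [5 + i / 50 for i in range(250)]
print(fps_by_interval(frame_times))  # [100.0, 50.0]
```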
Now let’s look at DirectX 8.0 3D gaming, where the Pentium III will spend a lot of time, since it’s the CPU (at 733MHz or so) going into Microsoft’s Xbox, along with a graphics chip very similar to our test systems’ GeForce3.
Once again, the PIII is outclassed, although it did manage to break the 5000 mark, which isn’t too bad. I’d have to say Intel stole the Xbox contract away from AMD on price, not performance, though.
Regular readers may notice that the Pentium 4 won back the 3DMark crown from AMD, just weeks after AMD captured it from Intel. NVIDIA’s new Pentium 4-optimized 12.41 video drivers are the reason why the P4 was able to take back the crown.
SPECviewperf workstation graphics
Now we’ll look at 3D graphics of a different type: 3D modeling, medical CAD, and other professional apps. Our GeForce3 card will handle a lot of the work here with its on-chip transform and lighting engine, but CPUs still matter…
Yet again, the PIII is the slowest of the bunch. In some tests, like ProCDRS, the video card limits performance absolutely, but where CPU performance matters most, in the DX and Light tests, the Pentium III fares the worst. These days, most new high-end workstations will be based on dual-processor Athlon or Xeon configurations, and that’s probably best. Even if dual Pentium III 1.2GHz workstations do become available, they won’t be the most compelling option.
These speech recognition tests are a new addition to our test suite. They’re based on Sphinx 3.3, which came to me via Ricky Houghton, who works in the speech recognition effort at Carnegie Mellon University. I’ll let him explain Sphinx and his hypothesis about its performance bottlenecks:
Sphinx is our speech recognition that CMU has been developing over the last 30 years. (Really! Many speech folks came out of here. Janet and Jim Baker, founders of Dragon for instance are from here. Several recognizers are based on the CMU system, this includes the Microsoft system, the apple system and even the Kurtzweil system now owned by L&H.)
We have two systems, both open source: Sphinx 2 and Sphinx 3. Sphinx 2 is a semi-continuous HMM based system that runs in less than real time on a reasonable machine (PIII 500Mhz, 512MB ram) Sphinx 3 is running at 1.6 to 1.8 times real-time on a PIII 933Mhz machine with CAS2 133Mhz SDRAM. I ran it on a PIII 866 MHz with RDRAM and saw the system run at about 1.2 times real-time. It turns out that Sphinx 3 is memory limited, an increase in CPU speed results in very little improvements in speed, increases in memory bandwidth result in sizeable improvements in speed.
You can find source for Sphinx at SourceForge.org.
Let’s see if his hypothesis holds true.
Well, not exactly. The 1.2GHz Pentium III, definitely at the bottom rung of the memory bandwidth ladder, beats both Athlons here. The P4 does do well, but not so dramatically as one might hope.
Incidentally, what we’re looking for here is a system to execute the speech recognition routine at a rate faster than real time. Doing so would open up new opportunities for low-cost, high-accuracy speech recognition systems. We’ll keep coming back to this one. One way or another, I expect we’ll see that happen in the next six months.
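"Times real time" is just processing time divided by the length of the audio, so anything under 1.0 means the recognizer keeps up with live speech. Here is a minimal sketch of that calculation. The function names and the 60-second clip are illustrative assumptions, not part of Sphinx.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """How long recognition takes relative to the length of the audio."""
    return processing_seconds / audio_seconds


def is_faster_than_real_time(processing_seconds, audio_seconds):
    """True when the recognizer finishes before the audio would finish playing."""
    return real_time_factor(processing_seconds, audio_seconds) < 1.0


# Hypothetical clip: 60 seconds of audio that takes 72 seconds to process
rtf = real_time_factor(72.0, 60.0)
print(f"{rtf:.1f}x real time")               # prints 1.2x real time
print(is_faster_than_real_time(72.0, 60.0))  # prints False
```

By that measure, a system running at 1.2 times real time still needs to get about 17 percent faster before it can keep pace with a live speaker.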
Now for Tim Wilkens’ impressive computational benchmark, ScienceMark. This time around, we’ll use only the original version of ScienceMark compiled in Compaq Visual Fortran. (If you’d like to see results from other versions, go here.) This suite of tests measures computational ability by running some well-known (in the right circles) scientific equations. Like 3DMark, ScienceMark then spits out a composite number denoting a system’s overall score in the suite.
Here’s how our contenders fared:
The PIII falls into last place, but look at some of the individual tests before passing judgment.
These three tests aren’t the only things ScienceMark measures, but they are some of the more important computational tests in the suite. Notice that the PIII performs relatively well on two of the three tests. Only in Primordia does it fall behind. If ScienceMark’s overall scores were weighted a bit differently, the PIII would be middle of the pack, not dead last.
All in all, the 1.2GHz Pentium III is an interesting piece of technology, but this processor is clearly beginning to show its age. The test results speak for themselves. The PIII 1.2GHz is still in the running with the gigahertz-plus competition, but in 3D graphics and other multimedia-type tasks, the newer processors, and their newer platforms, rule the roost.
That’s one of the reasons why Intel is pushing the Pentium 4 on the desktop, and relegating the PIII to the mobile and low-profile server markets, where it’s likely to find a healthy niche. However, that’s not the whole story. Both the mobile and server versions have 512K L2 caches. (The server version is called Pentium III-S, and the mobile version will include Intel’s power-saving technologies.) I suspect Intel gave the desktop version only 256K of L2 cache memory in order to keep from upsetting its desktop strategy with the Pentium 4. With 512K L2 cache and data prefetch, the Tualatin PIIIs might embarrass their bigger brothers a little.
However, I have no problem with Intel’s strategy here. The extra L2 cache won’t mask the Pentium 4 platform’s fundamental superiority, and if consumers and OEMs need a little bump in the right direction, that seems appropriate to me. The new Pentium III will probably be sent to do battle with the AMD Duron in the value market, and combined with the 815 chipset, it’s likely to do very well there. It’ll make a heck of a corporate desktop machine. And it will need that data prefetch logic (which didn’t seem to do much for the chip’s performance in our tests) to match features with the new “Morgan” core Durons coming from AMD. Built on Intel’s 0.13-micron fab process, quite a few Tualatins ought to fit on a single wafer, so they should be cheap to make, as well.
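To get a feel for what the shrink buys, here is a back-of-the-envelope sketch that assumes ideal scaling, where every feature shrinks in both dimensions by the ratio of the process sizes. Real chips never scale that cleanly, and we don't have die size figures from Intel, so treat the result as an upper bound on the benefit and not as a measurement.

```python
def ideal_area_scale(old_micron, new_micron):
    """Relative die area after a shrink, assuming perfect scaling in both dimensions."""
    return (new_micron / old_micron) ** 2


scale = ideal_area_scale(0.18, 0.13)
print(f"Relative die area: {scale:.2f}")          # prints 0.52
print(f"Dies per unit of wafer: {1 / scale:.2f}x")  # prints 1.92x
```

In other words, the same design could take up a bit more than half the silicon, which would nearly double the number of chips per wafer before accounting for yield, edge losses, and anything that doesn't shrink.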
Intel is releasing two versions of the desktop Tualatin Pentium III, one at 1.2GHz and another at 1.13GHz. However, Intel recently began selling a revised 0.18-micron Pentium III that runs at 1.13GHz. To avoid confusion, the 0.13-micron version will be called the Pentium III 1.13A processor (shades of the old Celeron 300A). Initially, the 1.2GHz version will be priced at $294, and the 1.13A version at $268.
In the server market, a chip that runs this cool will no doubt be a popular choice in low-profile servers. (And it does run cool. After running a Quake III botmatch for hours, the heatsink is barely above room temperature.) The extra L2 cache and data prefetch are perfect additions for SMP systems, too. Likewise, this chip should do well in the mobile market, where its low power consumption will be a big help, and its relatively low multimedia performance won’t be noticed.
The big story here from my point of view, however, is Intel’s 0.13-micron fabrication process. In this latest round of tests, we’ve seen the 1.8GHz Pentium 4 pull neck-and-neck with the Athlon. The P4-Athlon feud is now a see-saw battle, though the Athlon wins more than it loses. The 1.2GHz Pentium III proves Intel is capable of producing chips in volume on this new process. Once the Pentium 4 makes the transition to 0.13 microns, we’re going to be privy to some very serious speed.