Intel’s processors have benefited greatly from the die shrink that happened back in January when the “Northwood” P4s first hit the scene. Built using Intel’s 130nm manufacturing process, Northwood runs faster and cooler than the original “Willamette” Pentium 4, even as it packs in 512K of cache, double the amount in its predecessor. Since the Northwood arrived, processor watchers have been waiting impatiently for AMD to counter with a 130nm CPU of its own, a version of Athlon XP code-named “Thoroughbred.” It’s taken some time, but T-bred is finally here, running at 1.8GHz and given an AMD model number of 2200+.
But is T-bred swift enough to catch one of Intel’s 2.4 or 2.53GHz burners? Let’s take a closer look.
Let’s dispel the rumors right now. Thoroughbred is simply a die shrink of the Athlon XP. Nothing more. Well, OK, not much more. AMD says T-bred’s transistor count is 37.2 million, down a smidgen from the previous “Palomino” version of the Athlon XP, due to a more efficient layout and “lower voltage handling requirements.” That explains why the T-bred is a different shape than Palomino, too. But T-bred doesn’t include any notable performance tweaks like more cache, SSE2 instructions, or other sorts of engineering magic. The die shrink means the Athlon XP ought to be able to reach much higher clock speeds in the future, but clock for clock, a T-bred ought to perform exactly like a Palomino.
Not only that, but AMD is not yet (if ever) raising the Athlon XP’s front-side bus speed from its present speed of 266MHz. So Athlon XPs of any flavor will be hard pressed to take full advantage of advances in memory performance like DDR333 or dual-bank memory controllers.
Of course, none of these things mean, all by themselves, that the T-bred won’t be a screamer. The Athlon XP has scaled up fairly linearly in performance to date, and only the benchmarks will tell how well T-bred fares in that department.
We’ll get to benchmarks in a moment, but let’s consider T-bred’s other virtues. As you might expect with a die shrink, T-bred should be smaller, run cooler, and suck up less voltage than the Palomino. The 2200+ version requires only 1.65V, while the 1700+ needs just 1.5V. (All Athlon XP models, from 1700+ up, will transition to the T-bred core.) In fact, AMD allocated its first, limited supplies of production T-breds to computer makers for use in laptop PCs.
But the size difference is T-bred’s most striking attribute. Have a look at the difference between a Pally and a T-bred:
T-bred is downright teeny. To my eye, it’s nearly half the size of the Palomino. The shrink from 180nm to 130nm is major. Officially, T-bred is 80 mm², while Palomino is 128 mm². By contrast, the Pentium 4 is absolutely mammoth. Early Northwoods packed all 55 million of their transistors into a space of 145 mm², while ongoing process tweaks have cut the size on newer chips down to 131 mm², according to reports.
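As a rough sanity check on those numbers, here is a minimal Python sketch that compares the quoted die areas against an idealized shrink. The assumption (not something AMD has stated) is that die area scales with the square of the feature size; real layouts rarely shrink that cleanly.

```python
# Die areas and process nodes as quoted in the text above.
PALOMINO_MM2 = 128.0
TBRED_MM2 = 80.0
OLD_NODE_NM = 180.0
NEW_NODE_NM = 130.0


def ideal_area_ratio(old_nm: float, new_nm: float) -> float:
    """Area ratio if every dimension scaled with feature size (an idealization)."""
    return (new_nm / old_nm) ** 2


actual_ratio = TBRED_MM2 / PALOMINO_MM2
ideal_ratio = ideal_area_ratio(OLD_NODE_NM, NEW_NODE_NM)

print(f"Actual T-bred/Palomino area ratio: {actual_ratio:.3f}")
print(f"Ideal 180nm-to-130nm area ratio:   {ideal_ratio:.3f}")
```

By the official figures, T-bred comes in at a bit more than 60% of Palomino's area, so "nearly half" is generous; a perfect shrink would land closer to half.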
(Also, in case you’re wondering, AMD hasn’t abandoned its plans to move its CPU packages from brown, like you see here, to green, like you can see here. Apparently the color change is just taking some time, and the Athlon XP 2200+ sample we received from AMD just happens to be brown. Eventually, minty-fresh green will engulf the entire Athlon XP lineup.)
Of course, all of this shrinkage action has a purpose. The smaller the chips, the more chips AMD and Intel can manufacture per wafer. More chips per wafer means lower manufacturing costs, and ultimately, lower prices, too. AMD’s size advantage here is formidable, which ought to translate into a competitive advantage. However, Intel pulled a new trick out of its bag recently: it increased the diameter of its wafers from 200 mm to 300 mm, and there’s some debate over who has the advantage in terms of manufacturing costs as a result. Whatever the case, know this: a processor price battle is coming. The latest round of price cuts has already gone mighty deep, and there’s more looming on the horizon.
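To see why the debate isn't settled by die size alone, here is a rough Python sketch of candidate dies per wafer. It treats the wafer sizes as diameters of 200 mm and 300 mm and uses a common rule-of-thumb estimate (wafer area over die area, minus an edge-loss term). Which wafer size each chip is actually built on is an assumption for illustration, and yield, scribe lines, and wafer cost are ignored entirely.

```python
import math


def gross_dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> float:
    """Rule-of-thumb estimate: wafer area / die area, minus an edge-loss term.

    Ignores yield, scribe lines, and edge exclusion, so treat the result as a
    rough upper bound rather than a manufacturing figure.
    """
    radius = wafer_diameter_mm / 2.0
    area_term = math.pi * radius ** 2 / die_area_mm2
    edge_term = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return area_term - edge_term


# Die areas come from the text; the wafer pairings are assumptions.
tbred_200 = gross_dies_per_wafer(200.0, 80.0)
northwood_200 = gross_dies_per_wafer(200.0, 131.0)
northwood_300 = gross_dies_per_wafer(300.0, 131.0)

for label, dies in [
    ("T-bred, 200 mm wafer", tbred_200),
    ("Northwood, 200 mm wafer", northwood_200),
    ("Northwood, 300 mm wafer", northwood_300),
]:
    print(f"{label}: roughly {dies:.0f} candidate dies")
```

Under these assumptions, the bigger wafer more than makes up for Northwood's larger die in raw die count. What each wafer costs to process, and how many of those dies actually work, is where the real argument lies.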
Well, OK, maybe not looming on the horizon. More like hanging out over there, waiting to throw us a little party later on. With free beer and little cheese wedges with toothpicks in them. I can’t wait.
AMD should also be able to keep overall Athlon XP system costs down, because T-bred doesn’t require a new socket or, by and large, even a new motherboard design. Usually a BIOS update will suffice; even some old KT133A boards will work with T-bred, though I’m not sure I see the point of that. AMD is quite proud of the relative stability of its Socket A platform in this respect.
Personally, I’m happy for them and everything, but I’d rather have a faster front-side bus than a killer (or probably overkill) CPU upgrade for my KT133A rig.
What to watch for in the test results
Now that we’ve introduced you to T-bred, it’s time to get down to business and see how this beast performs. Thing is, we already know this chip isn’t wildly different from its predecessor, so the only real difference we’d expect to see here should come from the 66MHz clock speed increase between the Athlon XP 2100+ and 2200+, from 1.73GHz to 1.8GHz.
Don’t expect a light show or anything.
Meanwhile, we’ll be comparing it to Pentium 4 chips with both 400 and 533MHz front-side bus speeds, which is a little more intriguing. We know the Athlon XP seriously outperforms the Pentium 4 on a clock-for-clock basis, but the higher bus speeds improve the Pentium 4’s ability to execute instructions on a per-clock basis. The Pentium 4 is faster at 2.4GHz on a 533MHz bus than at 2.4GHz on a 400MHz bus, especially when paired with fast memory. We’ll be interested to see how the Athlon XP matches up against the Pentium 4 now that the P4 is faster at a given speed.
Beyond that, the real question most folks are probably asking about T-bred is: How does it overclock? Does the die shrink bring immediate benefits to those of us willing to run things out of spec a little? We’ll delve into that question, as well.
Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.
Our test systems were configured like so:
| ||Athlon XP||Pentium 4 845||Pentium 4 850||Pentium 4 850E|
|Processor||AMD Athlon XP 2100+ 1.73GHz, AMD Athlon XP 2200+ 1.8GHz||Intel Pentium 4 2.4GHz||Intel Pentium 4 2.4GHz||Intel Pentium 4 2.4GHz, Intel Pentium 4 2.53GHz|
|Front-side bus||266MHz (133MHz double-pumped)||400MHz (100MHz quad-pumped)||400MHz (100MHz quad-pumped)||533MHz (133MHz quad-pumped)|
|Motherboard||Shuttle AK35GT2/R||Abit BD7-RAID||Intel D850MD||Intel D850EMV2|
|Chipset||VIA KT333||Intel 845||Intel 850||Intel 850E|
|North bridge||VT8367||82845 MCH||82850 MCH||82850E MCH|
|South bridge||VT8233A||82801BA ICH2||82801BA ICH2||82801BA ICH2|
|Chipset drivers||VIA 4-in-1||Intel Application Accelerator 6.22||Intel Application Accelerator 6.22||Intel Application Accelerator 6.22|
|Memory size||512MB (2 DIMMs)||512MB (2 DIMMs)||512MB (4 RIMMs)||512MB (4 RIMMs)|
|Memory type||Corsair XMS3000 PC2700 DDR SDRAM||Corsair XMS2400 PC2100 DDR SDRAM||Samsung PC800 Rambus DRAM||Samsung PC800 Rambus DRAM|
|Graphics||NVIDIA GeForce4 Ti 4600 128MB (Detonator XP 28.32 video drivers)|
|Sound||Creative SoundBlaster Live!|
|Storage||Maxtor DiamondMax Plus D740X 7200RPM ATA/100 hard drive|
|OS||Microsoft Windows XP Professional|
I want to give a big thanks to Corsair for providing us with DDR333 memory for our testing. Their XMS3000 DIMMs allowed us to run the memory on our Shuttle AK35GT2/R test motherboard at CAS2 timings at 166MHz (that’s 333MHz DDR, kids). Good RAM didn’t hurt in our overclocking attempts, either. If you’re looking to tweak out your system to the max and maybe overclock it a little, Corsair’s RAM is definitely worth considering. Using it makes life easier for us as we’re dealing with brand-new chipsets and pre-production motherboards, because we don’t have to worry so much about stability and compatibility. The stuff flat works.
The test systems’ Windows desktops were set at 1024×768 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.
We used the following versions of our test applications:
- SiSoft Sandra Standard 2002
- Compiled binary of C Linpack port from Ace’s Hardware
- ZD Media Business Winstone 2001 1.0.2
- ZD Media Content Creation Winstone 2002 1.0
- POV-Ray for Windows version 3.5 beta RC3
- NewTek Lightwave 7.0b
- Sphinx 3.3
- ScienceMark 1.0
- LAME 3.91
- Xmpeg 4.5 with DivX Video 5.01
- MadOnion 3DMark 2001 SE
- Codecreatures Benchmark Pro
- Comanche 4 demo benchmark
- Serious Sam SE v1.05
All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.
We generally kick off our benchmark suite with some memory tests, and this time out the tests underscore an important point.
As you can see, the Athlon XP has hit a brick wall in memory bandwidth; raising the CPU clock speed doesn’t help. Although our test system has DDR333 memory, memory performance is limited by the Athlon XP’s 266MHz front-side bus. In fact, the T-bred at 1.8GHz comes out a little slower than the Palomino does at 1.73GHz. I’m not quite sure why that is, but I ran this test a number of different times, just to be sure, and the T-bred was consistently just a tiny bit slower.
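The bottleneck is easy to see on paper. Here is a minimal Python sketch of theoretical peak bandwidth for the buses in our test systems, using the base clocks from the configuration table. It assumes 64-bit data paths throughout (a detail not spelled out above), and these are best-case ceilings, not measured results.

```python
def peak_bandwidth_mb_s(base_clock_mhz: float, transfers_per_clock: int,
                        width_bits: int = 64) -> float:
    """Theoretical peak in MB/s: clock x transfers per clock x bytes per transfer."""
    return base_clock_mhz * transfers_per_clock * width_bits / 8


links = {
    "Athlon XP FSB (133MHz double-pumped)": peak_bandwidth_mb_s(133.33, 2),
    "DDR333 memory (166MHz double-pumped)": peak_bandwidth_mb_s(166.67, 2),
    "Pentium 4 FSB (100MHz quad-pumped)": peak_bandwidth_mb_s(100.0, 4),
    "Pentium 4 FSB (133MHz quad-pumped)": peak_bandwidth_mb_s(133.33, 4),
}

for name, mb_s in links.items():
    print(f"{name}: about {mb_s:.0f} MB/s peak")
```

The Athlon XP's bus tops out below what DDR333 can supply, so the extra memory bandwidth has nowhere to go, while either Pentium 4 bus has room to spare.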
The Pentium 4, on the other hand, fares especially well with a 533MHz bus. Although we chose not to repeat the results here, if you look at this review, you’ll see that the P4 does very well with DDR333 memory, too.
Now let’s make a graph that looks vaguely scientific, so your boss won’t mind if he catches you reading this at work.
Linpack shows, from left to right, floating-point performance when processing data stored in the L1 data cache, the L2 cache, and then main memory. (The processor has to step down the hierarchy of memory types and speeds as the size of the data matrices Linpack is feeding it grows.) The move from 1.73GHz to 1.8GHz for the Athlon XP boosts performance when accessing on-chip caches, but it does nothing to help once we pass about 320K matrix sizes, beyond the domain of the Athlon XP’s combined L1 data and L2 caches.
Notice that once we pass matrix sizes of about 100K, every variety of Pentium 4 system in our test is faster than any Athlon XP. The Pentium 4 platform has a pronounced advantage in memory bandwidth from the L2 cache out into main memory.
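For a feel of where those boundaries fall, here is a simplified Python sketch that maps a working-set size onto the memory hierarchy. The L1 data cache sizes and the exclusive-versus-inclusive distinction are assumptions drawn from general knowledge of these chips rather than from our test data, and the model ignores associativity and everything else competing for cache space.

```python
from dataclasses import dataclass


@dataclass
class CacheConfig:
    name: str
    l1_data_kb: int
    l2_kb: int
    exclusive: bool  # True if the L2 holds no copies of L1 contents

    def on_chip_capacity_kb(self) -> int:
        """Usable on-chip data capacity under this simplified model."""
        if self.exclusive:
            return self.l1_data_kb + self.l2_kb
        return self.l2_kb

    def level_for(self, working_set_kb: int) -> str:
        """Which level of the hierarchy a working set of this size would fit in."""
        if working_set_kb <= self.l1_data_kb:
            return "L1 data cache"
        if working_set_kb <= self.on_chip_capacity_kb():
            return "L2 cache"
        return "main memory"


# Assumed cache layouts (illustrative, not taken from the benchmark data).
athlon_xp = CacheConfig("Athlon XP", l1_data_kb=64, l2_kb=256, exclusive=True)
northwood = CacheConfig("Pentium 4 Northwood", l1_data_kb=8, l2_kb=512, exclusive=False)

for size_kb in (32, 100, 320, 400, 1024):
    for cpu in (athlon_xp, northwood):
        print(f"{size_kb:>5}K on {cpu.name}: {cpu.level_for(size_kb)}")
```

The 320K figure for the Athlon XP falls out of adding its L1 data cache to its L2, which only works because the two don't duplicate each other's contents.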
However, bandwidth is only one of the two key components, generally speaking, of memory performance. The other is latency, and we haven’t run any latency-oriented tests here. (We will next time out, honest.) Also, memory performance is itself only one piece of the overall performance picture.
The Athlon XP gains a couple of points from the 66MHz speed increase between the 2100+ and 2200+, but it’s not enough to catch the Pentium 4 2.4GHz.
Content Creation Winstone
Content Creation Winstone has been rewritten in its 2002 version to place more emphasis on memory performance, and it shows. The Athlon XP systems just can’t keep pace with the Pentium 4 rigs in this test, and bumping up the Athlon XP’s clock speed doesn’t help much.
POV-Ray 3D rendering
Here’s a test where the tide turns a little bit. Athlons have always womped Pentium 4 chips in raw x87 floating-point math performance, and that’s what POV-Ray rendering is all about. The T-bred’s higher clock speed shaves another 13 seconds off our render time versus the Athlon XP 2100+.
Lightwave 3D rendering
Lightwave is a nice demonstration of how 3D rendering performance can be enhanced using SIMD instruction set extensions. Lightwave uses Intel’s SSE2 extensions on the Pentium 4 (and on the Mac, it uses the G4’s AltiVec instructions) to speed the rendering process.
Obviously, this test isn’t really fair for the Athlon XP, but it is a real-world application we’re benchmarking, so Lightwave users won’t want to ignore these results.
I wrote NewTek, makers of Lightwave, to ask why their program doesn’t make use of SSE or 3DNow! extensions (both of which the Athlon XP supports) in addition to SSE2 and AltiVec. Unfortunately, I never got an answer out of them. I also wondered aloud about this question in our last CPU review, and surprisingly, no one wrote in with any good technical explanation why SSE or 3DNow! support wouldn’t be helpful in Lightwave. Where are all the know-it-all geeks when you need them?
LAME MP3 encoding
Our previous LAME test setup was simply being run over by high-speed CPUs; they were crunching through an entire 50MB audio file in about 20 seconds, with only fractions of a second separating the fastest times. So this time around, we’ve beefed things up by using a 101MB source audio file and asking LAME to encode a high-quality variable bit rate MP3. The exact command-line options we used were:
lame -v -b 128 -q 1 file.wav file.mp3
This encoding task produced the following results:
The Pentium 4 at 2.53GHz takes the top spot yet again.
DivX video encoding
Xmpeg can encode video files using the popular DivX format, which produces very high quality video in relatively small amounts of space. For this test, we took a 279MB video file, encoded in MPEG2 format at DVD quality, and converted it to a 37MB DivX file.
Xmpeg supports all the various x86 SIMD instruction sets, including MMX, 3DNow!, SSE, SSE2, and even different flavors of 3DNow!, like 3DNow! Enhanced. Most importantly, perhaps, Xmpeg makes good use of the Pentium 4’s SSE2 instruction set, which offers potentially higher performance than the SSE or 3DNow! instructions supported by the Athlon XP.
Once the Pentium 4 gets its SSE2 mojo going, there’s just no stopping it. The Athlon XP appears memory limited here; the clock speed increase is nearly useless.
Codecreatures Benchmark Pro
The Codecreatures benchmark is a graphical wonder; it pushes even more polygons than my 8th-grade geometry teacher, and it does so in conjunction with advanced graphics features like pixel shaders. We’ve wondered whether performance in this test is severely limited by the graphics card. So is it?
Nope! The T-bred takes us to new heights in Codecreatures. No, the performance difference in terms of frames per second isn’t huge here, but when you’re hovering around 30 frames per second, every frame counts.
3DMark 2001 SE
The T-bred claws its way to an additional 100 points in 3DMark versus the Palomino, but the improvement isn’t quite enough to catch up with the Pentium 4 at 2.4GHz.
Serious Sam SE
The Athlon XP has owned Serious Sam benchmarks for ages. Only the jump to a 533MHz bus made the Pentium 4 competitive, and the T-bred nearly catches the P4 2.53GHz here.
Comanche 4 only benefits a little bit from T-bred’s faster clock speed.
Sphinx is a high-quality speech recognition routine that needs the latest computer hardware to run at speeds close to real-time processing. We use two different versions, built with two different compilers, in an attempt to ensure we’re getting the best possible performance.
Sphinx is very sensitive to memory performance, and it shows. Once again, T-bred isn’t much faster than Palomino, because it’s limited by the front-side bus bottleneck.
We’ll close out our testing with ScienceMark, which is always interesting in a very, very geeky sort of way. ScienceMark runs a series of physics simulations in an attempt to confuse non-geeks. Err, in an attempt to measure scientific computing performance.
AMD’s top processor takes the top spot here, as usual. Now for some of the individual test scores…
As we’ve seen before, the Pentium 4 takes the memory-intensive Primordia test, while the Athlon XP takes the other two.
Well, this is the part where I tell you about all of our marvelous overclocking exploits with the new, cool-running, die-shrunk processor: how we cranked it up 50% over its initial clock speed; how the Athlon XP now has more headroom than a convertible Buick; how our 3DMark scores shot up by hundreds of points with a few simple BIOS tweaks.
However, I can’t do that.
I can’t do that because the darned thing wouldn’t overclock for us. Not by much, at least; not even by 100MHz. We were using mild bus speed overclocking, and we tried everything: core voltage tweaks, memory voltage tweaks, RAM timings more conservative than Gordon Liddy. Nothing helped enough to really matter.
At the end of the day, given the number of different ways we modified our system config and given the kinds of system crashes we were seeing, we could only come to one conclusion: the CPU just wouldn’t go any faster than about 1.89GHz. And even at that speed, it was on the ragged edge.
Now, overclocking is never a sure thing. Every chip is different, and you never know what will happen when you run a chip out of spec. So I’d better not draw any conclusions from our one-off, isolated experience with our very first T-bred sample. I shouldn’t speculate that AMD might be having trouble producing these chips with really good yields. And I really shouldn’t wonder out loud whether the Athlon XP’s 10-stage pipeline is hitting a snag at some point along the way that limits the chip’s peak clock speed. Most importantly of all, I shouldn’t mention the gossip I heard to that effect from other folks who had tested T-breds when I talked with them at Computex this past week. Especially not from engineers. I really, really shouldn’t do that.
Still, I can’t help but be a little bit worried about T-bred’s prospects given my experience. Certainly AMD could refine its fabrication process or tweak the T-bred core with a new stepping or two and make these things hum, up to 2GHz and beyond. Right now, however, our sample of this newly die-shrunk processor is running right at its clock speed limit, only 66MHz above its peak speed on AMD’s older fab process.
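To put that ceiling in numbers, here is a quick back-of-the-envelope Python sketch. The 13.5x multiplier is inferred from the rated 1.8GHz clock and the 133MHz base bus rather than read off the chip, and it assumes the multiplier stays put during a bus-only overclock, so take it as an estimate.

```python
STOCK_MHZ = 1800.0       # Athlon XP 2200+ rated clock
MAX_STABLE_MHZ = 1890.0  # roughly where our sample topped out
STOCK_BUS_MHZ = 133.33   # base bus clock (266MHz double-pumped)

# Inferred multiplier, rounded to the nearest half step (an assumption).
multiplier = round(STOCK_MHZ / STOCK_BUS_MHZ * 2) / 2

headroom_mhz = MAX_STABLE_MHZ - STOCK_MHZ
headroom_pct = headroom_mhz / STOCK_MHZ * 100
bus_needed_mhz = MAX_STABLE_MHZ / multiplier

print(f"Inferred multiplier: {multiplier}x")
print(f"Headroom: {headroom_mhz:.0f}MHz ({headroom_pct:.1f}%)")
print(f"Bus clock needed for {MAX_STABLE_MHZ:.0f}MHz: {bus_needed_mhz:.1f}MHz")
```

In other words, our sample's limit sits only about 5% above its rated speed, a long way from the 50% fantasy.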
The benchmarks tell an interesting story. In some cases, despite all of its CPU and bus clock speed disadvantages, the Athlon XP is faster than any Pentium 4 processor. However, the Pentium 4 wins out in the majority of our tests. In many of the tests where the Athlon XP is slower than the P4, the 2200+ model’s 66MHz clock speed increase doesn’t deliver much more performance than the 2100+. Clearly, the Athlon XP’s 266MHz front-side bus is a big bottleneck; it can’t even keep up with the latest DDR333 memory, and DDR400 is already on the horizon.
Back when the Pentium III and Athlon were near the 1GHz mark, we saw this same problem: the Pentium III’s clock speed hit a wall at 1.13GHz, and its slower bus just couldn’t deliver extra performance from DDR memory. As a result, AMD took the performance lead and held on to it until the Pentium 4 came into its own. Now the tables have turned. The Athlon XP’s bus is a bottleneck, and we’re starting to wonder how well the chip will scale up to higher clock speeds. By contrast, the Pentium 4 is just getting started, and its newer design and platform give it a decisive edge. No wonder AMD has dedicated the bulk of its time and effort to bringing its K8 chip to market.
However, AMD has yet to relinquish the price-performance lead. AMD has led in this key category for ages, and given the Athlon XP’s solid performance (even if it’s not the fastest in every test), we’ve found it hard not to recommend an Athlon XP to just about anyone.
This time out, AMD is playing an odd game with its pricing. If you consult the AMD price list and then the Intel price list, you’ll see that the Athlon XP 2000+ lists at the exact same price as the Pentium 4 2.0AGHz: $193. Though it’s not yet listed there, the Athlon XP 2200+ will follow a similar pattern; it will be priced at $241, the same price as the Pentium 4 2.26GHz. AMD is matching its prices to Intel’s using its model-number rating system as a guide.
That’s a dangerous policy, since the Athlon XP isn’t scaling up as well as the Pentium 4. Plus, the new P4 chips with 533MHz bus speeds are faster, clock for clock, than the 400MHz bus versions. In fact, AMD’s model numbering scheme may need adjustment for future Athlon XP models. (I wish we’d had time to benchmark the 2.2GHz and 2.26GHz variants of the Pentium 4 here today, so you could see a direct comparison between AMD’s model number and Intel’s clock speed. However, we were too busy with Computex this past week to make it happen.)
AMD really knows better than to price match Intel, however. If you check street prices on Pricewatch, for instance, you’ll find that the Pentium 4 2.0AGHz is selling for somewhere north of $197, while the Athlon XP 2000+ is available for as little as $161. AMD seems to be applying a discount in reality, even if the price list doesn’t reflect it.
Intel has its own discounts, too, especially for big system builders like Dell or HP. So if you plan on buying a pre-built PC, shop carefully. Of course, if you plan on doing that, well, ick. What are you thinking?
For those of us building our own boxes, the Athlon XP 2200+ is a pretty good value. However, if you’re looking to overclock, you’re probably better off with a lower speed grade of a Northwood Pentium 4, like the Pentium 4 1.6A. Those chips have a decent chance of hitting at least 2.13GHz on a 133MHz bus. If you’re not looking to overclock, by all means, check out the Athlon XP 2200+. But you might want to look closely at our benchmarks before you make up your mind; which CPU is better depends quite a bit on how you’re using it.
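The arithmetic behind that 2.13GHz figure is simple enough to sketch in Python. It assumes the Pentium 4 1.6A runs a 100MHz base bus at stock (the 400MHz quad-pumped bus) and that its multiplier stays fixed, so raising the bus is the only lever.

```python
def overclocked_mhz(stock_mhz: float, stock_bus_mhz: float,
                    new_bus_mhz: float) -> float:
    """Core clock after a bus-only overclock, assuming a fixed multiplier."""
    multiplier = stock_mhz / stock_bus_mhz
    return multiplier * new_bus_mhz


P4_STOCK_MHZ = 1600.0
p4_overclocked = overclocked_mhz(P4_STOCK_MHZ, 100.0, 133.33)
p4_gain_pct = (p4_overclocked / P4_STOCK_MHZ - 1) * 100

print(f"Pentium 4 1.6A on a 133MHz bus: about {p4_overclocked:.0f}MHz")
print(f"Gain over stock: about {p4_gain_pct:.0f}%")
```

That's roughly a one-third jump over stock, if the chip cooperates, against the handful of percent our T-bred sample could manage.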