AMD’s Athlon 64 processor

AT LONG LAST, AMD is ready to introduce its Athlon 64 processor to the world. This chip is the successor to the formidable K7 core, known to most as the AMD Athlon processor. The Athlon cemented AMD’s place as a respectable second source for x86 processors, a position shakily established by the K6 before it. This time around, AMD has chosen to bolster the K7 core with a number of key internal enhancements and a radical reworking of the PC’s internal plumbing. This new processor core, code-named Hammer, is intended to solidify AMD as a leader not just in desktops, but in servers and workstations, as well. AMD has even amended the x86 instruction set for the future with 64-bit extensions.

We’ve spent the past few weeks testing Hammer-based chips, including the Athlon 64 and its new big brother, the mighty Athlon 64 FX-51. Not only that, but Intel slipped us a Pentium 4 3.2GHz Extreme Edition processor at the last minute, and we’ve benchmarked it, as well, 2MB L3 cache and all. We can honestly say we were blown away by the performance of these new chips. The future is now. Read on to see how good it can be.

Hammer comes to the desktop
The Hammer CPU core is an evolutionary design based on AMD’s K7 microarchitecture. Nonetheless, Hammer is revolutionary, not so much because of what goes on inside the chip itself, but because of how it talks to the rest of the computer. We have dedicated most of our time and effort in preparing this review to empirical testing, so we’re not able to cover Hammer’s architectural innovations in as much detail as we’d like. Still, we’ll hit some of the major points that make AMD’s new processor distinctive. Among them:

  • An integrated memory controller — Conventional systems have long had a memory controller located on a “north bridge” chip that talks to the processor over a front-side bus. Hammer chips have the memory controller built in, so the processor talks to the memory controller directly at the full speed of the CPU, which is 2.2GHz in the case of the fastest Athlon 64. (The memory controller itself also runs at the speed of the CPU.) As a result, Hammer processors can access memory with very low latencies, opening up one of the most persistent bottlenecks to overall system performance. Current Hammer chips have DDR memory controllers compatible with memory speeds up to 400MHz.

    Beyond the basic performance benefits, the movement of the memory controller on die has implications for the organization of the entire Hammer platform. Core logic chipsets no longer need to provide memory controllers, and the Hammer, strictly speaking, has no traditional front-side bus. Even more mind-bendingly, multiprocessor Hammer systems have individual banks of memory for each processor, so they should scale very well as processors are added.

  • HyperTransport communications — HyperTransport is the glue that makes AMD’s reorg of the traditional PC work. A packet-based chip-to-chip interconnect, HyperTransport links are pairs of 8-bit or 16-bit unidirectional links running at speeds up to 800MHz. Throw in a little DDR action, sending data twice per clock cycle, and you have an effective clock rate of 1.6GHz per link. As implemented in Hammer, HyperTransport links have a maximum throughput of 6.4GB/s (16 bits upstream plus 16 bits downstream at an effective 1.6GHz).

    Hammer systems use HyperTransport for several things. In all Hammer systems, one of the CPUs (or the only CPU) talks to the rest of the system over a HyperTransport link. Traditional chipset services like AGP, PCI, and south bridge I/O are delivered over this link much like VPN tunnels are delivered over TCP/IP connections in a computer network. Done right, HyperTransport should simplify motherboard design by replacing slower and wider connections that require more traces to achieve similar results. In multiprocessor implementations, HyperTransport links between processors allow for inter-chip communications, as well.

  • 1MB of on-chip L2 cache — Previous versions of the K7 core have had varying amounts of L2 cache, up to 512K on the most recent “Barton” Athlon XPs. The first wave of Hammer chips all come with 1MB of L2 cache onboard, upping the ante by a factor of two.

    Hammer’s L1 cache sizes are unchanged from K7 at 64K for instructions and 64K for data. AMD’s caches tend to be exclusive, and that’s the case with Hammer; the L2 cache doesn’t replicate the contents of the L1 cache. With the L1 data and L2 caches combined, the Hammer chips’ total effective data cache size is 1088K.

    We’ve seen many times before the impact larger caches have on performance. Generally, more cache is better, but many tasks pull through too much data to derive any benefit from extra cache, so the benefits are uneven.

  • SSE2 instruction set support — Intel introduced the SSE2 instruction set for single-instruction, multiple-data (SIMD) calculations with the Pentium 4. SSE2 allows for SIMD operations on 64-bit IEEE double-precision floating-point values, packed two to a 128-bit register, so it’s useful in tasks like 3D rendering, graphics drivers, gaming, and media encoding. Previous Athlon chips have supported SIMD instruction set extensions for both integer (MMX) and single-precision floating-point (SSE, 3DNow!) operations, but they have been missing SSE2, where SIMD on x86 arguably has the most impact. The Hammer core can take advantage of applications optimized for SSE2, making it more competitive with the Pentium 4.

  • AMD64 instruction set support — In a gutsy move, AMD has concocted its own set of extensions to the x86 instruction set architecture, or ISA. The new AMD64 ISA isn’t a radical departure, but it allows for 64-bit addressing, and it adds some additional registers to the register-poor x86 ISA.

    AMD’s move to 64 bits accomplishes several things. First, it eliminates the barrier of 4GB of addressable memory in 32-bit systems. 4GB may sound like a lot today, but as an upper limit, 4GB could become a nasty constraint, even on common desktop systems, in the next few years.

    Second, by adding 64-bit extensions to the x86 ISA, AMD has created an evolutionary alternative to Intel’s Itanium chips, which break almost entirely with the industry-standard x86 software infrastructure. Naturally, code will have to be recompiled for AMD64, but AMD64 is familiar enough that retooling compilers for it should be relatively painless.

    Finally, AMD64’s additional registers, which are present in Hammer, promise better performance on recompiled code. (Registers are essentially temporary local storage slots on a processor. More of them means less storing data in cache or memory.) Addressing memory in 64-bit chunks won’t, by itself, necessarily improve performance. The Hammer has eight new 64-bit integer registers and eight new 128-bit SSE/SSE2 registers to help.

  • A slightly longer pipeline — Hammer’s main branch prediction/recovery pipeline has been lengthened from 10 stages to 12. This change should allow the processor to run at higher clock rates at the expense of executing fewer instructions per clock. The Hammer’s pipeline is still much shorter than the 20-stage main pipeline in the “speed demon” Pentium 4.

  • 0.13-micron SOI fab process — All Athlon 64 processors are manufactured at AMD’s fab in Dresden, Germany, where AMD employs an advanced silicon-on-insulator (SOI) fabrication process to make these chips. Laying down the silicon on top of an insulator should allow the chip’s transistors to operate faster. IBM, who pioneered SOI technology, claims clock frequency gains from SOI as high as 35 percent in testing. However, transitions to new chip fabrication processes are fraught with potential snags. When AMD delayed the Athlon 64’s launch this past spring, the company cited difficulties producing chips in volume using SOI technology as the primary culprit and had to turn to IBM for assistance.

    The move to SOI is crucial because AMD’s enhancements to Hammer add up to a whole lot more transistors per chip than the K7. The last revision of the Athlon XP, code-named Barton, had 54.3 million transistors and a die size of 101 square millimeters. The Northwood Pentium 4 has 55 million transistors on a die that’s 145 square millimeters. By contrast, the Athlon 64 packs 105.9 million transistors onto a 192 square millimeter die.
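
To put those figures side by side, here is a quick back-of-the-envelope density calculation. It uses only the transistor counts and die sizes quoted above; the dictionary layout and the function name are just for illustration.

```python
# Transistor counts (in millions) and die areas (in square millimeters),
# exactly as quoted in the text above.
CHIPS = {
    "Athlon XP (Barton)": (54.3, 101),
    "Pentium 4 (Northwood)": (55.0, 145),
    "Athlon 64": (105.9, 192),
}


def density(millions_of_transistors, area_mm2):
    """Return thousands of transistors per square millimeter."""
    return millions_of_transistors * 1000 / area_mm2


for name, (count, area) in CHIPS.items():
    print(f"{name}: {density(count, area):.0f}K transistors per mm^2")
```

By this rough measure, the Athlon 64 nearly doubles both the transistor count and the die area of Barton, so the two chips land at a similar density. The extra transistors cost area rather than coming for free.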

Those are the high points of AMD’s remodeling job. As we’ve noted, these changes have wide-ranging implications for Hammer motherboards, chipsets, operating systems, and software. We will, of course, be testing the performance implications shortly.
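
Two of the numbers in the list above are simple arithmetic, and it may help to see them worked. The sketch below assumes an idealized model: an exclusive L2 adds its full capacity to the L1 data cache, while a strictly inclusive L2 would duplicate the L1 contents. The function names are illustrative only.

```python
def effective_data_cache_kib(l1_data_kib, l2_kib, exclusive):
    """Distinct data capacity of an L1 data cache plus L2, in KB.

    Exclusive: the L2 holds no copies of L1 lines, so the capacities add.
    Inclusive (idealized): L1 lines are duplicated in L2, so L2 is the ceiling.
    """
    if exclusive:
        return l1_data_kib + l2_kib
    return l2_kib


def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


# Hammer: 64K of L1 data cache plus 1MB (1024K) of exclusive L2.
print(effective_data_cache_kib(64, 1024, exclusive=True), "K effective data cache")

# The 32-bit ceiling mentioned above, expressed in gigabytes (2**30 bytes).
print(addressable_bytes(32) // 2**30, "GB addressable with 32-bit addresses")
```

A full 64-bit pointer could in principle name 2**64 bytes. How many address bits a given chip actually implements is a separate matter that this sketch does not cover.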

 

ClawHammer, SledgeHammer—JackHammer? BallpeenHammer? TackHammer?
AMD is introducing Athlon 64 chips based on two Hammer variants today, code-named ClawHammer and SledgeHammer. The principal difference between Claw and Sledge is the width of the chip’s connection to memory. ClawHammer’s memory controller has a single, 64-bit path to RAM, while SledgeHammer’s is a dual-channel or 128-bit design.
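
The difference in theoretical peak bandwidth follows directly from the bus width. This is a minimal sketch that assumes DDR400 memory (400 million transfers per second) on both chips and counts a gigabyte as 10^9 bytes. The function name is made up for the example.

```python
def peak_bandwidth_gbs(bus_width_bits, transfers_per_second):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfers_per_second / 1e9


DDR400 = 400e6  # 200MHz clock, two transfers per clock

claw = peak_bandwidth_gbs(64, DDR400)     # single-channel ClawHammer
sledge = peak_bandwidth_gbs(128, DDR400)  # dual-channel SledgeHammer

print(f"ClawHammer:   {claw:.1f} GB/s peak")
print(f"SledgeHammer: {sledge:.1f} GB/s peak")
```

These are ceilings, not measurements. Real-world benchmarks will come in below them.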

AMD originally planned for all Athlon 64 chips to be based on ClawHammer, while Opteron workstation/server processors would be based on Sledge. Turns out, though, AMD has decided to intro a top-of-the-line desktop chip with a dual-channel memory controller called the Athlon 64 FX. (Somebody phone NVIDIA marketing!) This chip is essentially a remarked Opteron running at 2.2GHz. To be more specific, the 2.2GHz flavor is dubbed “Athlon 64 FX-51”, and it gets no other designation. FX-51. That’s it. AMD figures the folks who will be willing to cough up the $733 list price for this baby will know how it performs from having read publications like this one, so there will be no Pentium 4 equivalency games played here.

If you want to play those games, you can pick up a non-FX Athlon 64 like the Athlon 64 3200+. These ClawHammer-based products have a 64-bit path to memory and come in the 754-pin package originally intended for all Athlon 64s. AMD will initially be selling the Athlon 64 3200+, which runs at 2GHz, for $417—a veritable bargain compared to the FX model.


The Athlon 64 FX (left) comes in a ceramic package, while
the Athlon 64 3200+ wears an organic package


With 940 pins, Athlon 64 FX (left) would make a good brush.
The Athlon 64 3200+ (right) sports only 754 pins.

Because of AMD’s late decision to go with a SledgeHammer-based desktop chip, the Athlon 64 FX drops into Opteron motherboards with 940-pin sockets like the Asus SK8N. Of course, this fact messes up AMD’s careful market segmentation plans, especially since it now looks like the future of Athlon 64 is 128-bit memory interfaces. To remedy this problem, Athlon 64 FX processors will soon get their very own, physically incompatible 939-pin socket. To aid in the infrastructure transition, Athlon 64 FX chips will be available in both 940-pin and 939-pin packages for the duration of 2004.

Word has it AMD may introduce a separate, 941-pin package later this year just out of spite.

I kid. I kid.

These market segmentation games don’t bother me too much, so long as AMD delivers a quality product. Motherboard manufacturers, however, may be a different story. I expect many of them are scrambling right now to prepare 939-pin motherboards for use with upcoming Athlon 64 FX chips.

If you’re thinking the prices on these Athlon 64 chips sound steep, you’re thinking right. AMD is only introducing two models of the Athlon 64, and the cheap one costs north of four hundred bucks. There are several possible reasons, not mutually exclusive, for these high prices. AMD says it wants to end the practice of pricing its chips below Intel’s when AMD’s chips are technically superior. To that end, the company says it’s positioning the Athlon 64 against Intel’s upcoming Prescott chips. The Athlon XP can continue to battle it out with existing Pentium 4 chips, and the Athlon 64 FX will sit alone atop the desktop performance throne. Like so:


Source: AMD.

This looks like wishful thinking to me, but perhaps it will help AMD raise its average selling prices, even if the strategy doesn’t succeed entirely. I can’t help but think the most important reason for high prices on the Athlon 64 has to do with limited supply. Price can be a very effective rationing system, and the fact AMD isn’t introducing slower, lower-cost Athlon 64 chips suggests such rationing may be necessary at present.

 

Intel’s extreme measures
If AMD can pull a workstation/server-class chip into the desktop market in order to capture the performance lead, Intel should be able to do the same, right? Well, that’s exactly what Intel decided to do, and apparently the decision was made very recently. Late last week, the Pentium 4 3.2GHz Extreme Edition arrived at my doorstep, innocently proclaiming its willingness to be benchmarked against whatever AMD had to offer. This processor is basically a rebadged Xeon MP chip, but then again the Xeon is just a rebadged Pentium 4, so it all comes back around somehow. The net result is a Pentium 4 that runs at 3.2GHz with the usual 512K of L2 cache—plus a whopping 2MB of on-chip L3 cache. Unlike true Xeon chips, the Extreme Edition fits into plain ol’ Socket 478 Pentium 4 motherboards. We plugged it into our Abit IC7-G test mobo, and it worked fine without need of a BIOS update or anything else.


The Pentium 4 3.2GHz Extreme Edition proclaims its newness with handwritten markings

I mentioned earlier that the additional cache on the Athlon 64 should help performance in some applications, but not in all of them. The same is true for the Pentium 4 Extreme Edition, but with a larger cache, there’s a better chance an application’s working data set will fit inside it. Intel is aiming at gamers with this chip, as AMD is with the Athlon 64 FX-51, and that fact alone should tell you something about how the added cache is likely to affect performance.

Intel says the Extreme Edition should ship around about November, which is just the right time to ship a new model space heater. At 178 million transistors, this Edition is most definitely Extreme. The die size is an impressive 237 square millimeters, or just about the size of Vermont. Still, the Extreme Edition seems like a sweet proposition. With Hyper-Threading and all of that cache, the user experience just puttering around on the desktop or playing with productivity apps should be creamy smooth indeed. The Extreme’s price has yet to be announced.

By the way, we have no qualms about Intel horning in on AMD’s product launch. AMD has done the same to Intel in the past, and besides, Intel’s decision to pull a killer server chip into the desktop market is very much welcome to us. In the end, consumers benefit. There is also the distinct possibility that the Extreme Edition is Intel’s way not just of competing, but of avoiding embarrassment. You’ll see what I mean by that when we get into the benchmark scores shortly.

The question I have is whether Intel will remain committed to future Extreme Edition processors. AMD has proclaimed its commitment to keeping the Athlon 64 FX on top. In fact, AMD has been releasing low-volume high-end parts since a year ago, with the T-bred 2800+ chip, which was never available via retail. These products aren’t a good value proposition, but they do give well-funded enthusiasts a chance to grab the latest technology before everybody else. I’m curious to see whether Intel will play this game long term.

 

Not-so-core logic
The Athlon 64 arrives with the support of two decent third-party chipsets—or at least what’s left of the chipset, now that the memory controller has moved onto the CPU. Available now, NVIDIA’s nForce3 150 chipset is a single-chip design with AGP and south bridge I/O functions rolled into one. When used with an Opteron or Athlon 64 FX, NVIDIA calls it the nForce3 Pro 150, but I’m pretty sure the Pro and non-Pro chips are one and the same. Interestingly, NVIDIA hasn’t included its much-ballyhooed Audio Processing Unit (APU) on the nForce3, so unaccelerated AC’97 audio is all you get. We’re left to wonder what will become of the APU.

To coincide with the Athlon 64 launch, NVIDIA is introducing its ForceWare software, which will roll up all of NVIDIA’s platform software under a single brand name. ForceWare will add RAID 0, 1, and 0+1 capabilities to nForce3 150’s three ATA/133 controllers, among other things.

Planned for later this year is nForce3 250, a revised chipset with support for Serial ATA (including RAID), eight USB 2.0 ports, and optional Gigabit Ethernet. nForce3 Go will add power management for laptop computers and NVIDIA integrated graphics.

VIA’s K8T800 is a dual-chip design with distinct north and south bridges for easy upgrades to either chip. The K8T800, which is now shipping, will go into everything from desktops to workstations to servers. This chipset offers Serial ATA RAID and eight USB ports courtesy of VIA’s new VT8237 south bridge. Also, the K8T800 has a full 16-bit, 800MHz implementation of HyperTransport that VIA has dubbed Hyper8. With 6.4GB/s of bandwidth between chipset and CPU, the K8T800 has a theoretical advantage over the nForce3 150’s 3.6GB/s.
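
The 6.4GB/s and 3.6GB/s figures can be reproduced from the link widths and clocks. The sketch below assumes the nForce3 link is 16 bits downstream and 8 bits upstream at 600MHz, which is the configuration listed for our test system, and that both links send data twice per clock. The function names are illustrative.

```python
def ht_direction_gbs(width_bits, clock_mhz):
    """Bandwidth of one HyperTransport direction in GB/s (1 GB = 1e9 bytes)."""
    transfers_per_second = clock_mhz * 1e6 * 2  # double data rate
    return width_bits / 8 * transfers_per_second / 1e9


def ht_link_gbs(downstream_bits, upstream_bits, clock_mhz):
    """Aggregate bandwidth of a link: downstream plus upstream."""
    return (ht_direction_gbs(downstream_bits, clock_mhz)
            + ht_direction_gbs(upstream_bits, clock_mhz))


k8t800 = ht_link_gbs(16, 16, 800)  # 16-bit/800MHz in both directions
nforce3 = ht_link_gbs(16, 8, 600)  # 16-bit down, 8-bit up at 600MHz

print(f"K8T800:      {k8t800:.1f} GB/s")
print(f"nForce3 150: {nforce3:.1f} GB/s")
```

Note that both totals add the two directions together. Traffic flowing one way sees at most 3.2GB/s on the K8T800 and 2.4GB/s downstream on the nForce3 150.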


A block diagram of a typical K8T800-based desktop system. Source: VIA.

Incidentally, when we set out to review the Athlon 64, we tested the Athlon 64 FX-51 with the nForce3 Pro 150-based Asus SK8N motherboard. This was the test system configuration shipped to us by AMD. However, after seeing the performance of the K8T800, especially in gaming and graphics, we decided to retest with the VIA chipset, as well. Both sets of results are included in our review. We had no intention of doing a platform comparison in our processor review, but I think you’ll understand why we chose to include the K8T800.

VIA’s plans for Athlon 64 chipsets also include the K8M800 with integrated graphics for low-cost desktops and the K8N800 for mobile applications.

About the 64-bit Windows pre-beta
You will notice that we tested the Athlon 64 FX-51 in some applications with a pre-beta version of Windows XP 64-bit Edition. The very fact that Microsoft allowed AMD to ship systems to reviewers with an unreleased 64-bit version of Windows says good things about Microsoft’s commitment to AMD64 support. The pre-beta version of Windows XP 64-bit Edition is still very much a work in progress, and not everything works perfectly yet. For instance, the NVIDIA video drivers didn’t seem to have proper OpenGL acceleration, and we didn’t have access to a 64-bit version of DirectX 9. As a result, we tested only text applications and a few Direct3D 8.x games.

Of course, these applications are all 32-bit programs, so they don’t take advantage of extra registers or memory address space. They are just 32-bit programs running in the Windows-on-Windows facility in the 64-bit edition of WinXP. They should provide an interesting preview of performance in 64-bit Windows, but I expect performance will improve as the OS nears its final form.

 

Cooling the beast
I’d like to take a second to show you AMD’s new cooler retention mechanism, because I’m fairly impressed with it. The pictures below will give you the general idea. This is the stock cooler from AMD. One came in our test system, and one came with each of the two retail Opteron 240s we recently purchased.


The cooler’s lever arm is clipped closed


Release the lever and the tension


Use a screwdriver to remove the clip, and…


That was easy!


A plate under the mobo holds the cooler retention bracket in place

I’ve broken more than my share of stock Intel Pentium 4 coolers, but I was able to get the hang of the new AMD retention mech after just a few uses. Also, the plate on the underside of the mobo seems to help dissipate heat, which is, erm, cool.

 

What to watch for in the benchmark results
There are plenty of storylines here, and I can’t mention them all. However, you will want to watch for several things. First, of course, there are AMD’s two new processors, the Athlon 64 3200+ and Athlon 64 FX-51. How much of an improvement over the K7 are these new Hammer chips?

Also, we’ve included results for an Opteron 146 chip. This is a SledgeHammer running at 2.0GHz, so it provides a pretty direct comparison to the Athlon 64 3200+, which runs at the same speed but has only one memory channel. Then again, the Athlon 64 3200+ seems to be on a faster motherboard, so that comparison is a little wobbly.

We’ll be keenly interested to see how the Pentium 4 3.2GHz Extreme Edition fares. Will it really offer a worthy improvement over the Pentium 4 3.2GHz, and more importantly, can it beat the Athlon 64 FX-51 for the top spot?

Back down on earth, the battle between the Pentium 4 3.2GHz and the Athlon 64 3200+ may be more interesting still, because folks might actually buy these chips. Can AMD capture the performance lead at the top of the mainstream desktop market from Intel?

Finally, I spent some extra time in the test lab to make sure we had results for slower speeds of Pentium 4 and Athlon XP, so you can see how truly fast the new high-end chips are. Check out the performance delta from low end to high.

Our testing methods
As ever, we did our best to deliver clean benchmark numbers. Tests were run at least twice, and the results were averaged.

Our test systems were configured like so:

Processor: Athlon XP ‘Barton’ 3200+ 2.2GHz
    Front-side bus: 400MHz (200MHz DDR)
    Motherboard: Asus A7N8X Deluxe v2.0
    North bridge: nForce2 SPP
    South bridge: nForce2 MCP-T
    Chipset drivers: nForce Unified 2.45
    BIOS revision: 1005
    Memory size: 1GB (2 DIMMs)
    Memory type: Corsair TwinX XMS4000 DDR SDRAM at 400MHz
    Hard drive: Seagate Barracuda V 120GB ATA/100
    Audio: nForce2 MCP/ALC650

Processor: Athlon XP ‘Barton’ 2500+ 1.83GHz; Athlon XP ‘Barton’ 2800+ 2.183GHz
    Front-side bus: 333MHz (166MHz DDR)
    Motherboard: Asus A7N8X Deluxe v2.0
    North bridge: nForce2 SPP
    South bridge: nForce2 MCP-T
    Chipset drivers: nForce Unified 2.45
    BIOS revision: 1005
    Memory size: 1GB (2 DIMMs)
    Memory type: Corsair TwinX XMS4000 DDR SDRAM at 333MHz
    Hard drive: Seagate Barracuda V 120GB ATA/100
    Audio: nForce2 MCP/ALC650

Processor: AMD Athlon 64 3200+ 2.0GHz
    Front-side bus: HT 16-bit/800MHz downstream, HT 16-bit/800MHz upstream
    Motherboard: MSI K8T Neo
    North bridge: K8T800
    South bridge: VT8237
    Chipset drivers: 4-in-1 v.4.49, ATA 5.1.2600.10, Audio 5.10.0.5920
    BIOS revision: 1.0
    Memory size: 768MB (3 DIMMs)
    Memory type: Corsair XMS3200 DDR SDRAM at 400MHz
    Hard drive: Seagate Barracuda V 120GB SATA 150
    Audio: VT8237/ALC650

Processor: AMD Athlon 64 FX-51 2.2GHz
    Front-side bus: HT 16-bit/800MHz downstream, HT 16-bit/800MHz upstream
    Motherboard: MSI 9130
    North bridge: K8T800
    South bridge: VT8237
    Chipset drivers: 4-in-1 v.4.49, AGP 4.42, Audio 6.14.1.3870
    BIOS revision: 1.0
    Memory size: 1GB (2 DIMMs)
    Memory type: Infineon PC3200 registered ECC DDR SDRAM at 400MHz
    Hard drive: Seagate Barracuda V 120GB SATA 150
    Audio: VT8237/ALC201A

Processor: AMD Opteron 146 2.0GHz; AMD Athlon 64 FX-51 2.2GHz
    Front-side bus: HT 16-bit/600MHz downstream, HT 8-bit/600MHz upstream
    Motherboard: Asus SK8N
    North bridge: nForce3 Pro 150
    South bridge: (none; single-chip design)
    Chipset drivers: AGP 3.34, ATA 3.44, Audio 5.10.0.5100
    BIOS revision: 1002
    Memory size: 1GB (2 DIMMs)
    Memory type: Infineon PC3200 registered ECC DDR SDRAM at 400MHz
    Hard drive: Seagate Barracuda V 120GB ATA/100
    Audio: nForce3 Pro/ALC650

Processor: Pentium 4 2.4 ‘C’ GHz; Pentium 4 2.8GHz; Pentium 4 3.2GHz; Pentium 4 3.2GHz Extreme Edition
    Front-side bus: 800MHz (200MHz quad-pumped)
    Motherboard: Abit IC7-G
    North bridge: 82875P MCH
    South bridge: 82801ER ICH5R
    Chipset drivers: INF Update 5.0.1015, ATA 5.0.1007.0, Audio 5.10.0.5250
    BIOS revision: 1.6
    Memory size: 1GB (2 DIMMs)
    Memory type: Corsair TwinX XMS4000 DDR SDRAM at 400MHz
    Hard drive: Seagate Barracuda V 120GB SATA 150
    Audio: ICH5/ALC650

All systems
    Graphics: GeForce FX 5900 Ultra
    OS: Microsoft Windows XP Professional
    OS updates: Service Pack 1, DirectX 9.0b

Sorry about the 768MB of RAM in the Athlon 64 3200+ system. I couldn’t get it to boot with either pair of 512MB DDR400 DIMMs I had on hand, and its motherboard had only three DIMM slots, so 768MB was as close as we could come. I don’t believe this difference in memory size should affect any of the benchmarks we used.

All tests on the Pentium 4 systems were run with Hyper-Threading enabled.

Thanks to Corsair for providing us with memory for our testing. If you’re looking to tweak out your system to the max and maybe overclock it a little, Corsair’s RAM is definitely worth considering.

The test systems’ Windows desktops were set at 1152×864 in 32-bit color at an 85Hz screen refresh rate. Vertical refresh sync (vsync) was disabled for all tests.

We used the following versions of our test applications:

All the tests and methods we employed are publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

 

Benchmark results

Memory performance
By tradition, we start our CPU tests with synthetic memory performance benchmarks, in part to keep them separate from real-world benchmarks, and in part because they’re generally where the action is when a truly new processor arrives. Such is the case today. The Athlon 64’s integrated memory controller promises great things, so let’s dive in and see if we can measure its impact.

The Athlon 64 FX-51 shows exactly why AMD decided to incorporate the memory controller into the processor. The FX-51 leads all contenders with over 5.5GB/s of memory bandwidth in Sandra. The Athlon 64 3200+, meanwhile, performs very well for a single-channel solution, just edging out the Athlon XP 3200+ in Sandra, and seriously outrunning it in cachemem’s bandwidth test.

The P4 Extreme Edition doesn’t show much benefit over the stock P4 in tests of memory bandwidth, but that’s expected.

Linpack can show us L1 and L2 cache hierarchies at work, as well as real-world memory bandwidth. That orange line you see flying off the edge of the graph is the P4 Extreme Edition, whose L3 cache is larger than our data matrix sizes. We’re gonna have to adjust our tests if this keeps up.

Among the more sane results, you can see the extended size of the Athlon 64’s L2 cache compared to the Athlon XP, and you can see how the Athlon 64 FX-51 promises the most sustained bandwidth when matrix calculations spill into main memory.

The Extreme Edition’s massive cache forced us to choose a different data point for our latency sample. Even with a 4MB block, the Extreme’s big caches help mask memory access latency, as one might expect with the pre-fetching algorithms of the Pentium 4.

Nevertheless, the Athlon 64 chips destroy the competition here. The integrated memory controller appears to shave over 25 nanoseconds off memory access. But, my friends, that’s just one sample point. Let’s look at them all.

 
Memory performance (continued)
Not only are our 3D graphs indulgent, but they’re useful, too. I’ve arranged them manually in rough order from worst to best, for what it’s worth. I’ve also colored the data series according to how they correspond to different parts of the memory subsystem. Yellow is L1 cache, light orange is L2 cache, and orange is main memory. The red series on the Extreme Edition graph represents L3 cache. Of course, caches sometimes overlap, so the colors are just an interesting visual guide.

The Athlon 64 FX-51 and the Opteron 146 require registered DIMMs, but the Athlon 64 3200+ does not. Probably as a result of that difference, the Athlon 64 3200+ achieves the lowest memory access times.

The Pentium 4 Extreme Edition remains… extreme.

 

Unreal Tournament 2003

The Athlon 64 chips are worldbeaters in Unreal Tournament, well ahead of the Pentium 4 3.2GHz. The P4 Extreme Edition’s extra cache helps in UT2003, and its third-place finish (behind the two FX-51 setups) maintains some respectability for Intel.

Quake III Arena

I guess you can see now why the P4 Extreme Edition makes sense. A total of four different AMD Hammer configs finish ahead of the Pentium 4 3.2GHz, but the Extreme Edition leads them all, loading up large chunks of Quake III into its L3 cache and going to town. Nevertheless, the Athlon 64s show their mettle here, outrunning the P4 3.2GHz.

You can begin to see the difference between the K8T800 and the nForce3. With the Athlon 64 FX-51, the K8T800 pulls ahead of the nForce3 Pro by over 10 frames per second. The 2GHz Hammer chips are even more lopsided, with the single-channel Athlon 64 3200+ on the K8T800 trouncing its dual-channel counterpart, the Opteron 146, on the nForce3 Pro. The Athlon 64 3200+ somehow even manages to beat the Athlon 64 FX-51.

Wolfenstein: Enemy Territory

Wolfenstein: Enemy Territory is a tight race, but the Athlon 64 FX-51 manages to take the top spot when paired up with the K8T800.

 

Comanche 4

We couldn’t have asked for more drama, even if we were producers of a reality show on Fox. The Extreme Edition nips the Athlon 64 FX by a hair. That L3 cache is good for about 8 frames per second in Comanche 4, and that’s just enough to do the trick.

Serious Sam SE

The Hammer processors rule in Serious Sam. Look, also, at the performance delta from top to bottom here. That’s huge.

3DMark03

3DMark’s overall score is driven primarily by video card performance, but the Pentium 4 chips are a bit faster in this test.

The CPU tests are another story. The Athlon 64 chips sweep CPU test 1. CPU test 2 seems to be more memory bandwidth oriented, and the Extreme Edition performs well in it.

 

Sphinx speech recognition
Ricky Houghton first brought us the Sphinx benchmark through his association with speech recognition efforts at Carnegie Mellon University. Sphinx is a high-quality speech recognition routine that needs the latest computer hardware to run at speeds close to real-time processing. We use two different versions, built with two different compilers, in an attempt to ensure we’re getting the best possible performance.

There are two goals with Sphinx. The first is to run it faster than real time, so real-time speech recognition is possible. The second, more ambitious goal is to run it at about 0.8 times real time, where additional CPU overhead is available for other sorts of processing, enabling Sphinx-driven real-time applications.
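
For readers unfamiliar with the "times real time" metric, here is a minimal sketch of how such a score can be computed and judged against the two goals above. The function names and the sample timings are hypothetical and are not taken from the Sphinx benchmark itself.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio length; below 1.0 beats real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def classify(rtf, headroom_target=0.8):
    """Map a real-time factor onto the two goals described in the text."""
    if rtf <= headroom_target:
        return "real time with headroom"
    if rtf < 1.0:
        return "real time"
    return "slower than real time"


# Hypothetical example: 60 seconds of speech processed in 45 seconds.
rtf = real_time_factor(45.0, 60.0)
print(f"{rtf:.2f} times real time: {classify(rtf)}")
```

Lower scores are better here, which is why the fastest systems sit at the low end of the results.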

All but two of our test systems finish below 0.8 times real time, which must mean Sphinx is ready to deploy. The graph’s sort order is a little deceptive, because it’s sorted by the Microsoft compiler results, and the Athlon chips perform better with the Intel compiler. (Go figure.) In terms of absolute performance, the Athlon 64 FX on the K8T800 is actually the third fastest config, right behind the Pentium 4 3.2GHz Extreme and the regular P4 3.2GHz.

LAME MP3 encoding
We used LAME 3.92 to encode a 101MB 16-bit, 44kHz audio file into a very high-quality MP3. The exact command-line options we used were:

lame --alt-preset extreme file.wav file.mp3
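
If you want to time the same encode on your own machine, something like the following would do. It is a rough sketch, not the harness we used. It assumes a `lame` binary is on the PATH and passes the preset as `--alt-preset extreme`; the helper names are invented for the example.

```python
import subprocess
import time


def lame_command(wav_path, mp3_path):
    """Build the argument list for the encode described in the text."""
    return ["lame", "--alt-preset", "extreme", wav_path, mp3_path]


def time_encode(wav_path, mp3_path):
    """Run the encode once and return elapsed wall-clock seconds.

    Raises CalledProcessError if lame exits with an error, and
    FileNotFoundError if no lame binary is installed.
    """
    start = time.perf_counter()
    subprocess.run(lame_command(wav_path, mp3_path), check=True)
    return time.perf_counter() - start


print(" ".join(lame_command("file.wav", "file.mp3")))
```

Calling `time_encode("file.wav", "file.mp3")` a few times and averaging the results would follow the same run-twice-and-average approach described in our testing methods.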

The Pentium 4 has always performed well at media encoding tasks, and that’s the case here. The Extreme Edition’s L3 cache doesn’t help at all, though, nor does the Athlon 64 FX’s integrated memory controller.

DivX video encoding
Xmpeg is partially self-tuning, and we noted that it chose the SSE2 Optimized iDCT on the Hammer processors.

You can see how poorly the K7 chips handle DivX encoding compared to the Pentium 4. The Hammer processors close the gap substantially, but they can’t quite catch the 3.2GHz P4s.

 

3ds max rendering
We begin our 3D rendering tests with Discreet’s 3ds max, one of the best known 3D animation tools around. 3ds max is both multithreaded and optimized for SSE2. We rendered a couple of different scenes at 1024×465 resolution, including the Island scene shown below. Our testing techniques were very similar to those described in this article by Greg Hess. In all cases, the “Enable SSE” box was checked in the application’s render dialog.

SSE2 and Hyper-Threading are a potent combo in 3D rendering, as the Pentium 4 3.2GHz CPUs prove. Still, the Athlon 64 FX holds the top spot in the Earth-Apollo scene.

 

Lightwave rendering
NewTek’s Lightwave is another popular 3D animation package that includes support for multiple processors and is highly optimized for SSE2. Lightwave can render very complex scenes with realism, as you can see from the sample scene, “A5 Concept,” below.

Also, I should note that in our recent workstation PC comparo we tried a number of different threads for the rendering engine in an attempt to exploit Hyper-Threading. At the time, we thought we were getting the best scores with multiple concurrent threads on the Pentium 4, but on further investigation, that turned out not to be the case. Lightwave uses SSE2 well enough that more threads don’t really help, or so it seems. All the results below are single-threaded.

Adding SSE2 support was a big win for the Athlon 64. The top Pentium 4 chips still post the lowest render times, but look at the massive render time differences between the 2.2GHz Athlon XP 3200+ and the 2.2GHz Athlon 64 FX-51.

 

POV-Ray rendering
POV-Ray is the granddaddy of PC ray-tracing renderers, and it’s not multithreaded in the least. Don’t ask me why—seems crazy to me. POV-Ray also relies more heavily on x87 FPU instructions to do its work, because it contains only minor SIMD optimizations.

We tested with the old “chess2.pov” scene we’ve been using forever. In a recent article, we also tested with the “official” POV-Ray benchmark, but time constraints prevented us from including it here. I also believe our “chess2” scene is more representative of everyday POV-Ray performance than the “official” benchmark scene.

In this x87-intensive renderer, the AMD chips are the clear winners. I think the Pentium 4s might give them a run for their money if POV-Ray were multithreaded, though.

 

Cinebench 2003 rendering and shading
Cinebench is based on Maxon’s Cinema 4D modeling, rendering, and animation app. This revision of Cinebench measures performance in a number of ways, including 3D rendering, software shading, and OpenGL shading with and without hardware acceleration.

Cinema 4D’s renderer is multithreaded, so it takes advantage of Hyper-Threading. For the AMD-based systems, I’ve reported the single-processor results. For the P4 systems, I’ve reported the multi-threaded results, which in all cases were notably faster.

Cinema 4D likes the Hyper-Threading. The Athlon 64 systems can’t quite compete with that.

The Athlon 64 FX makes up for its loss in the first test by sweeping the rest of them.

 

SPECviewperf workstation graphics
SPECviewperf simulates the graphics loads generated by various professional design, modeling, and engineering applications.

Notice here the contrast between the Athlon 64 FX with the K8T800 and with the nForce3 Pro. With the K8T800, the Athlon 64 FX is arguably the fastest system overall in the viewperf suite. The nForce3 Pro, however, seems to limit performance quite a bit.

Also, here’s a case where the Pentium 4 Extreme Edition’s L3 cache doesn’t seem to help much. That’s almost surprising, because it tends to help much more often than not.

 

ScienceMark
I’d like to thank Alex Goodrich for his help working through a few bugs in the 2.0 beta version of ScienceMark. Thanks to his diligent work, I was able to complete testing with this impressive new benchmark, which is optimized for SSE, SSE2, and 3DNow! and is multithreaded, as well.

In the interest of full disclosure, I should mention that Tim Wilkens, one of the originators of ScienceMark, now works at AMD. However, Tim has sought to keep ScienceMark independent by diversifying the development team and by publishing much of the source code for the benchmarks at the ScienceMark website. We are sufficiently satisfied with his efforts, and impressed with the enhancements to the 2.0 beta revision of the application, to continue using ScienceMark in our testing.

The molecular dynamics simulation models “the thermodynamic behaviour of materials using their forces, velocities, and positions”, according to the ScienceMark documentation. Sounds simple enough, right?
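
To make that description a little more concrete, here is a minimal sketch of one common way molecular dynamics codes advance a particle from its force, velocity, and position: a velocity Verlet step. Whether ScienceMark uses this particular integrator is an assumption, and the spring force and every name below are purely illustrative.

```python
def velocity_verlet_step(x, v, force, mass, dt):
    """Advance one 1-D particle by dt; returns the new (x, v)."""
    a0 = force(x) / mass                      # acceleration at the old position
    x_new = x + v * dt + 0.5 * a0 * dt * dt   # position update
    a1 = force(x_new) / mass                  # acceleration at the new position
    v_new = v + 0.5 * (a0 + a1) * dt          # velocity uses the average acceleration
    return x_new, v_new


def spring_force(x, k=1.0):
    """Hypothetical stand-in force: a harmonic spring pulling toward x = 0."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy for the spring system above."""
    return 0.5 * mass * v * v + 0.5 * k * x * x
```

A real simulation repeats this step for many particles in three dimensions with a far more expensive force calculation, which is where the floating-point load comes from.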

Primordia “calculates the Quantum Mechanical Hartree-Fock Orbitals for each electron in any element of the periodic table.” In our case, we used the default element, Argon.

The next test measures performance in AES encryption.

The Hammer core excels in the classical computing algorithms above. Matrix multiplication with BLAS may be a different story, however. Notice that ScienceMark’s BLAS tests are highly optimized using x87 assembly, SSE, SSE2, and 3DNow! as appropriate. As a result, these tests are probably a much better indicator of matrix multiplication performance than the version of Linpack we use primarily to measure memory bandwidth.

The Pentium 4 achieves the highest peak throughput in both single-precision (SGEMM) and double-precision (DGEMM) floating-point calculations with proper use of SSE and SSE2. However, the Athlon 64 processors are more amenable to various types of optimizations, and they perform best with the compiled C code, as well. Interestingly enough, in DGEMM, the Hammer chips appear to achieve near-peak performance with three different types of code, two scalar and one vector. They don’t seem to care how the data is organized, whereas the Pentium 4 responds much better with vectorization.
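
For readers unfamiliar with the BLAS routines named above, the sketch below spells out what a GEMM call computes, C = alpha*A*B + beta*C, as a plain scalar triple loop, along with the conventional 2*m*n*k operation count used to turn a run time into a FLOPS figure. It is illustrative Python rather than ScienceMark's code, and the function names are made up for this example.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a x b) + beta * c for row-major lists of lists."""
    m, k, n = len(a), len(b), len(b[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]   # one multiply and one add per step
            out[i][j] = alpha * acc + beta * c[i][j]
    return out


def gemm_flops(m, n, k):
    """Operation count conventionally credited to an (m x k) by (k x n) product."""
    return 2 * m * n * k


def mflops(m, n, k, seconds):
    """Convert a measured run time into millions of floating-point ops per second."""
    return gemm_flops(m, n, k) / seconds / 1e6
```

The scalar loop corresponds to the compiled-code and x87 style of execution; an SSE2 version does the same arithmetic but works on two double-precision values per instruction, which is why data layout matters more for vectorized code.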

 

picCOLOR image analysis
We thank Dr. Reinert Muller with the FIBUS Institute for pointing us toward his picCOLOR benchmark. This image analysis and processing tool is partially multithreaded, and it shows us the results of a number of simple image manipulation calculations. The overall score is indexed to a Pentium III 1GHz system based on a VIA Apollo Pro 133. In other words, the reference system would score a 1.0 overall.
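
An indexed score of this kind can be sketched in a few lines: divide each test result by the reference system's result, then combine the ratios. How picCOLOR actually weights its sub-tests isn't stated here, so the geometric mean below is an assumption, and the function name is made up for illustration.

```python
import math


def indexed_score(results, reference):
    """Geometric mean of per-test ratios against a reference system.

    Both arguments map test name -> throughput (higher is better).
    A system identical to the reference scores 1.0.
    """
    ratios = [results[name] / reference[name] for name in reference]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))
```

With this kind of index, a score of 3.0 means the system is, on balance, three times as fast as the reference machine across the tests.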

The Athlon 64 FX is over three times the speed of Dr. Muller’s reference system in picCOLOR, way ahead of the nearest Pentium 4. Let’s look at some selected results from the individual picCOLOR tests to see why that is.

The Athlons all excel at the GraphCopy and AddressMem functions, as well as Fixed Interpolation. Otherwise, the Pentium 4 processors are very competitive. The Athlon 64 3200+ trips up in the video tests—this is a familiar problem with the K8T800 chipset in this test. We’re unsure what the cause is.

 
Conclusions
AMD’s Athlon 64 processors are very impressive performers. They inherit all the strengths of the Athlon XP, but few of the weaknesses. For a long while, the give-and-take between the Pentium 4 and Athlon XP involved a kind of imbalance, with the Pentium 4 dominating in certain types of benchmarks while the Athlon XP dominated in others. No more. With very fast memory access and SSE2 support, the Athlon 64 chips match up well against the P4 in nearly every way. Our set of benchmarks is a little heavy on 3D rendering, where optimizations for SSE2 and Hyper-Threading bolster the Pentium 4, but overall, the Athlon 64 FX-51 stakes a strong claim to the title of fastest x86 processor. The FX-51 is so flat-out quick in 3D gaming, one wonders whether the Pentium 4 3.2GHz Extreme Edition doesn’t exist just to save face for Intel. Were it not for the Extreme Edition’s copious amounts of L3 cache, the Athlon 64 FX-51 would nearly have run the table in our gaming tests.

The P4 Extreme Edition does hold its own against the Athlon 64 FX, and you have to like Intel’s willingness to mine its Xeon line for extra desktop performance. I am a little surprised by the breadth of the benchmarks in which the Extreme Edition’s massive amounts of on-chip cache improve performance over the stock Pentium 4, especially the games. When you can practically load Quake III into cache and execute it, though, good things are bound to happen. Let’s hope Intel follows through with sufficient volumes and somewhat reasonable pricing on the P4 Extreme Edition. It shouldn’t cost a penny more than the Athlon 64 FX-51, especially because the Extreme Edition seems to heat up our test labs noticeably more than any other CPU we’ve tested. That’s just a seat-of-my-pants evaluation, but I swear, the seat of my pants got pretty sweaty.

For those of us with more pedestrian spending limits, the Athlon 64 3200+ looks like a great value. Yes, it costs over 400 bucks, but the stock Pentium 4 3.2GHz is selling for more than $600 right now. The Athlon 64 3200+ may trail the P4 3.2GHz in overall performance by the thinnest of margins, but no way is the P4 worth another $150 to $200. And that’s without considering the 64-bit question.

In fact, we’ve barely scratched the surface of the 64-bit issue beyond confirming that the Windows 64-bit pre-beta seems to run 32-bit code reasonably well. AMD supplied some 64-bit test apps with the Athlon 64 FX-51 review system, but I’m afraid we spent too much time investigating new graphics chips to devote proper attention to the Athlon 64’s AMD64 extensions. We’ll have to look at that in a future article. Of course, the true test of 64-bit performance will come with a release OS and real 64-bit applications, assuming they become available. AMD seems to be making all the right moves to garner support for AMD64, but this is new territory. We’re all wondering how successful AMD’s 64-bit initiative will be, and only time will tell.

All in all, Hammer translates surprisingly well to the desktop. That didn’t seem like a foregone conclusion when the first Opterons arrived this past spring at lower clock frequencies, but the Hammer core scales exceedingly well with clock speed. So long as AMD can ramp up supply of Athlon 64 chips at a decent pace and keep raising clock speeds to counter Intel’s upcoming Prescott core, it looks like a winner. 

Comments closed
    • lyc
    • 12 years ago

    oh, what great days these were for amd… things in november 2007 aren’t quite so rosy :/

      • wingless
      • 11 years ago

      LOL! We have to wait until ~November 2008 to see if Deneb redeems AMD too. I hope Bulldozer smashes the competition like Hammer did back in those days.

    • ganz
    • 16 years ago

    Am I missing something?

    Why doesn’t the Opteron 146 beat hell out of the Athlon 64? The Opteron is a sledge hammer (i.e. a chip with a 128-bit dual channel to main memory) running at 1.8 GHz, while the Athlon 64 3200+ is a claw hammer (i.e. a chip with a 64-bit single (?) channel to main memory) running at 1.6 GHz.

    • NYU Arab
    • 16 years ago

    According to the inq, the p4ee is smp capable… AND significantly cheaper than its xeon counterpart.

    http://www.theinquirer.net/?article=11773

      • Anonymous
      • 16 years ago

      That would be great, but who makes an SMP 478-pin board?

      • Krogoth
      • 16 years ago

      Maybe in design but, I doubt there’s is going to be chipset that can support two Skt478 chips. Since the market for it is far too small to make it worthwhile for any chipset manufacturer. I also think the the extra pins found on the regular P4 Xeon weren’t just for market segmentation they were also needed to interconnect the P4 Xeons.

      edit: spelling 🙁

    • Anonymous
    • 16 years ago

    seeing as how micro$oft has yet to deliver a 64 bit os, why not try doing some benchmarks with one of the several 64 bit linux varieties? especially since amd is aiming towards the enthusiest/power user market. it would be interesting to see how the different chips perform on it.

    • Anonymous
    • 16 years ago

    READ THIS! THIS IS FUNNY! http://www.tomshardware.com/cpu/20030923/athlon_64-22.html

    • Anonymous
    • 16 years ago

    http://www.forbes.com/2003/09/23/cx_ah_0923tentech.html I still don't believe Intel could or would do 64 bits in Prescott. Aside from proving themselves liars, it would take a lot more work and result in a chip that couldn't fit in the first iteration's Socket-478 package.

    • Anonymous
    • 16 years ago

    Is there not a 64bit version of Unreal tournament 2003 that would be interesting to test?

      • --k
      • 16 years ago

      http://www.xbitlabs.com/articles/cpu/display/athlon64-fx51_14.html

    • --k
    • 16 years ago

    Good article Scott I liked the part were you broke down the features of the A64. AMD wishes their literature was so concise.

    I found a plugin for POV Ray that supports multithreaded rendering through SMP/HT. I haven’t checked it out yet, but it’s freeware so there no risk in trying.

    http://www.it-berater.org/smpov.htm Also found a version of POV optimized for Athlon XPs. http://speedycpu.dyndns.org/opt/

    • droopy1592
    • 16 years ago

    How long will socket 754 last?

      • Anonymous
      • 16 years ago

      2004, and then it dies from what I’ve seen somewhere…

    • Anonymous
    • 16 years ago

    There’s no big difference between the FXs and Athlon64 in terms of performance. Just see Anantech review.

    http://www.anandtech.com/cpu/showdoc.html?i=1884 He has an Athlon 64(2.0GHz running @ 2.2GHz) and the FX-51 outperforms the A64@2.2GHz only by a hair.

    However this may change when the FXs goes to 938 pins with non-ECC DIMMs

    • muyuubyou
    • 16 years ago

    Is Apple’s powermac G5 “the world’s fastest Personal Computer”? 😀

    http://www.apple.com/powermac/

    • Anonymous
    • 16 years ago

    And what do yu suppose the price tag for this Intel P4 with 2MB cache will be?
    Let’s see: The XEON version runs around FOUR THOUSAND DOLLARS right now!

    Ridiculous.

      • Anonymous
      • 16 years ago

      i dont remember but i htink its around $800

      • Anonymous
      • 16 years ago

      Where are you getting your prices? The biggest cached Xeon 3.06’s (1 MB) that I can find for sale online are going for under $750.

        • axeman
        • 16 years ago

        Maybe he’s talking about Xeon MP’s which are for 4-way and greater systems, much pricier.

          • axeman
          • 16 years ago

          In fact, the p4 EE is based upon the Xeon MP, the plain Xeon or Xeon DP doesn’t have L3 cache, its just a P4 that supports dual processors. Right now a Xeon MP 2.8ghz with 2MB of L3 is going for around 5K near as I can tell.

            • Anonymous
            • 16 years ago

            Now I’m thoroughly confused. I thought the Xeon DP was a cheaper Xeon with only dual processor support. What are the non-DP ones that are, according to what you’re all saying, not MP’s either?

            • axeman
            • 16 years ago

            I think the standard Xeon IS a Xeon DP, there are basically the DP and the MP and nothing else. I could be wrong though.

    • Anonymous
    • 16 years ago

    Some of the source code is available on the sciencemark.de site; the .org site is not a perfect copy yet :(.

    • Anonymous
    • 16 years ago

    only one sciencemark benchmark has source code listed so how can you say they are ‘publishing much of the source code’ ??!?
    I sciencemuck 2.0 is more like it.

    • Anonymous
    • 16 years ago

    Is it just me or is anyone dissapointed with the 64/3200+ performance!?!

      • Anonymous
      • 16 years ago

      Well on the one hand it IS as fast as/faster than a 3.2 ghz P4C. On the otherhand I guess I b[

        • daniel4
        • 16 years ago

        Well we already know that there will be an Athlon64 3400+ launched later on and maybe even a 3600+ come Prescott launch day so it’s not over yet.

    • albundy
    • 16 years ago

    #19, yes and no. it seems that the cache would make it so, but you wont find any boards that support more that 1 P4, so if you need a quad setup, you must buy xeon.

    I would like to see performace in 64bit apps. I know few are available if any, but I am sure it will raise the 64fx numbers quite a bit, since it seems to be almost on par with the p4 3.2ee/non-ee that is going for $150 less. its like running windows 3.1 on a pentium machine.

    • Anonymous
    • 16 years ago

    Does anyone know anything about the idle/full load temps of the FX/athlon 64?? Nothing is mentioned in TR review which is otherwise quite good…

      • Anonymous
      • 16 years ago

      sorry, i cant remember where i saw them but i think a couple review sites had temps in the mid to high 30’s, maybe going into the 40’s but all i remember was that cooling wasnt much of an issue, but o/cing was guessed to be related to the “core limits”…

        • Anonymous
        • 16 years ago

        it is a pity they don’t mention such an important issue because if it runs at 38-42 full load, it is an indication that it has a lot of headroom…

          • Pete
          • 16 years ago

          Ace’s got their FX-51 up to 2.8GHz–with a Prometia cooler and the CPU at -39 degrees C. 🙂

            • Anonymous
            • 16 years ago

            x86-secret.com got their P4EE to 4.372GHz with vapochill. And the Athlon 64 3200+ had much less overclocking headroom than the FX.

      • Anonymous
      • 16 years ago

      I recently tested the Athlon64 3200+ and I can tell you it ran very cool. The heatsink was “warm” to the touch not hot at all. I know how shitty this sounds, but I was unable to test for accurate temp readings. The heatsinks being produced are set to handle 89watts of heat yet they produce well under that. If I had to guess I would say 35C was the temp and 40C under load.

    • Kurlon
    • 16 years ago

    Hrmmm… my dual OC’d Xeon is starting to feel a little threatened, but primarily ’cause I’m pushing a Matrox G400 (fans died on my GF 2Ti and GF 2MX) not due to cpu perf. Hafta see where we are at in 6months, may upgrade then.

      • indeego
      • 16 years ago

      heh… you o/c’ed a Xeon? Good atheist’s-equivalent-of-a-lord why

        • Kurlon
        • 16 years ago

        Well… I was making sure my soldering iron worked, when I accidently loaded it up with solder and voltmodded my motherboard. Now this was completely by accident I assure you.

        Ok, one small honest mistake, it can happen to anyone. This not being my day however I also proceeded to slip and bump up from a 400mhz FSB to a 533mhz FSB. I think I blacked out about here. When I came to I had determined the highest mult I could run stabily at.

        Gotta watch out for those freak accidents.

        The real reason, I use 7 specific filters in Photoshop and was sick of my friends Mac SE beating me in that benchmark.

        (Basically, ’cause I can. My friend just built a 3ghz P4 C box, I had to keep up.)

    • sativa
    • 16 years ago

    well i have $550 to blow right now and i think i’ll be getting a 16ms 17″ LCD + some random goodies with the leftover money rather than upgrade my box just yet.

    i’ll wait for the prices of these things to drop a little bit first.
    my 933 p3 is getting old lol

      • Anonymous
      • 16 years ago

      AMD’s site quotes a $278 3000+

      sites are guessing that one is due out soon, granted its a “desktop replacement” chip, but i dont see why a bios update can support it ona normal mobo

      that is my hope at grabbing an afforable ship (meaning below $300) within half a year

    • d0g_p00p
    • 16 years ago

    It’s a shame that AMD cannot ramp up speeds to match the P4. Imagine how devastating to Intel if AMD was able to put out a 3.2Ghz Athlon64 for the same price as a P4 3.2Ghz. All one can only hope.

      • Anonymous
      • 16 years ago

      No, SSE can’t use the 80 bit internal representation that the x87 processor can use. However, a 64-bit floating point representation is what most other CPUs use, so that should not necessarily be held against SSE.

    • Mr Bill
    • 16 years ago

    A very well done review. Took my whole lunch hour to read it. I’m very impressed by AMD’s line up. The P4 sure has legs in vectorized SSE. But does SSE give the same math precision for scientific computation, anybody? I wonder if doing singular value decomposition to a given precision would still have the P4 winning for larger arrays.

      • Mr Bill
      • 16 years ago

      SSE2 has lower precision than IEE. I found the answer here…
      http://www.aceshardware.com/forum?read=65029682 Guess it does not matter for most applications. Spec Mark must not require better than 64-bit precision.

        • Anonymous
        • 16 years ago

        64 bit is the standard for double precision. 80 bit was an aberration.

    • Anonymous
    • 16 years ago

    From Aces:

    ” These boards are faced with the limitations of a typical DDR chipset: support for at most 2 GB (K8T800) or 3 GB (nForce 150) and it is likely that inserting a third DIMM throttles DDR400 back to 333 MHz, as indicated by MSI’s manual. ”

    TR usd 3 dimms… ;o

    Wonder if tthis is the case or not….DAMAGE you out there ? 🙂

      • Damage
      • 16 years ago

      Not so in the case of our Athlon 64 3200+ board. You can see the Sandra scores yourself. We had three DIMMs installed, and we got 3003MB/s of bandwidth–well above PC2700 territory. The MSI manual for that board said only “approved” DIMMs would run at 400MHz in the third slot. Our Corsair XMS3200 seems to have managed fine.

        • Anonymous
        • 16 years ago

        good good 🙂

    • just brew it!
    • 16 years ago

    Hey, so when does 64-bit Windows for the Athlon 64 officially launch?

    I mean, Linux is all well and good… but most people — gamers especially — are still gonna want their Windows…

      • eitje
      • 16 years ago

      good thing the 32-bit version appears to do pretty well still on the A64s. 🙂

        • droopy1592
        • 16 years ago

        Too bad we can’t make Dual Athlon FX boxes. I’d plunk down 2 large to make one of those.

          • Anonymous
          • 16 years ago

          you mean dual opteron???

      • indeego
      • 16 years ago

      No official date. It will not be released Gold earlier than Q3 ’04.

        • Krogoth
        • 16 years ago

        I don’t really think the A64 will be able to support SMP. On the other hand prehaps the A64-FX can be hacked to go SMP since it’s just a rebadge Opetron. If you want to go SMP without hacks you have to get Opetrons.

    • Anonymous
    • 16 years ago

    ok, I read the comments and looked at their website, so where’s the source code for sciencemark?

      • eitje
      • 16 years ago

      it would appear to be in the “Benchmark Docs” section.
      Specifically, the Membench doc has a lot of asm in it. i imagine the Moldyn section would as well, if it weren’t for it not having a link from page 4/9 to page 5/9. 🙂

      also, someone needs to tell them that there’s no dash in tech report, dagnabbit!

      and that only about half of their website actually seems to work. 😉

    • Zenith
    • 16 years ago

    Tom knows if he put the FX to 2.4 or higher, intel wouldn’t have a chance. That site makes me sick.

    • Captain Ned
    • 16 years ago

    Does this mean that Intel may finally spring Yamhill on us?

      • droopy1592
      • 16 years ago

      Prescott has over 730 pins and built in 64bit. Something about dual integer cores.

      • p645n
      • 16 years ago

      Go to this NY Times article; http://tinyurl.com/oeud for "Microsoft Corp. on Tuesday unveiled a version of its Windows XP operating system for Advanced Micro Devices Inc.'s new Athlon 64 processor and said the system would also work with a 64-bit desktop chip from Intel Corp. if the company develops one.".

    • wagsbags
    • 16 years ago

    What I’m finding interesting is how much performance has increased with how little clock speed increase. My 2.4B is getting crushed by huge margins by the top end processors even though they only have a 30% clock speed advantage.

    • daniel4
    • 16 years ago

    Is it me or does the Athlon64 3200+ outperform the P4 3.2Ghz by a larger margin when a R9800 Pro is used in both systems? Dunno just looking at the reviews that use that chip and the Athlon64 3200+ shines a little bit more in those than in the ones that use the 5900 Ultra.

    • indeego
    • 16 years ago

    q[

      • wagsbags
      • 16 years ago

      Dude he wasn’t making a comment about the price. Almost twice a performance difference with that little ghz and time of release difference is pretty good.

        • Anonymous
        • 16 years ago

        And not to forget that your processor will keep it’s value for much longer because of software that will be optimized for it in the future. 64bit is the way to go anyway.

        Can’t wait to see UT2K3 en HL-2 64bit benchmarks. And I expect MS to add 64bit drivers in WHQL validation. That’ll make sure that all hardware will have solid 64bit drivers to drive the softwares to their max ! We all have seen what optimized code can do for the Pentium 4’s, I’m very curious about what optimization can do for the A64FX

        I’ve heard about max. 30% speed increase in UT2k3 on 64bit hardware.

      • Krogoth
      • 16 years ago

      Well, I see you point that the a system FX-51 price isn’t jusify when you can get a Barton 2500+ system and OC it to 3000+ speeds for 1/3 of a price yet get 80% of a FX-51 performance. But you seemed to forget that going to the bleeding edge. You always have to pay a hefty pretium especially on a new releases. Intel isn’t exactly innocent on this account ether, you still have you pay quite a bit amount to get a P4 3.2C system. Eventhough, Intel is very likely to do a price cut on these processors. I’m not going to even go and mention what is was like back in Intel’s near-monopoly days.

      • eitje
      • 16 years ago

      he was talking about the performance delta, not the price/performance delta. 🙂

      • Anonymous
      • 16 years ago

      Hi! We’re talking about absolute performance, not price.

    • Alex
    • 16 years ago

    Damage, nice review, good work. 😉

    FX looks pretty darn good…

    • BigMadDrongo
    • 16 years ago

    The top few systems are probably almost twice as fast as my current machine, in pretty much every test. Been a while since that happened…
    (This comp isn’t exactly ancient after all – little over a year old, and while it wasn’t bleeding-edge even then it was pretty damn good.)

      • John S
      • 16 years ago

      The question is, do you feel the need to play catch up? The performance might be insane, but these days you can do damn fine without spending top dollar (except on the video front).

      I’d actually suspect that AMD might want to increase the cost of the top of the line to offset some shift in buyer’s habits toward the lower end.

        • indeego
        • 16 years ago

        9600’s are $120-$140. Hardly call that top dollar, and give you fantastic performance.

        I’m a cheapo still and think I’ll wait for them to hit $100 shipped. HL2 ready or not.

        • Anonymous
        • 16 years ago

        My real problem nowadays is not processor speed, but disk quality. It has become very difficult to find a floppy drive or a hard disk which offers the same quality as you could find them ten years ago. I actually still have a 3.5″ drive runnig, which dates from 1991. It’s the oldest drive I have, and ever had. The others are younger, and all replace drives which never reached the age of that oldy.

    • daniel4
    • 16 years ago

    Newegg has the Athlon64 listed @ $449 with the lowest full ATX motherboard being listed @ $141!

    • Anonymous
    • 16 years ago

    What?? NO overclocking tests? Especially amid all the rumors of the FX-51 being available with an unlocked multiplier at some point? Your last round of CPU tests had no OC’ing either.

    Are we seeing a disturbing new trend in TR tests? Should TR be renamed PR (Pedestrian Report), “Stock Computing Explored?” Have Damage & Co. aged from “wild and crazy geeks” to “we’re scared it might blow up?”

    /[

      • BooTs
      • 16 years ago

      Its The Tech Report, not the Miscellaneous Geeky News.

      Damage wrote an excellent Technical Report, regarding CPU performance across several platforms. Not the benefits of trying to overclock the lastest chip. The Athlon 64 platform wont live or die by its overlcocking performance, but by its stock performance, ability to compete with high-end Intel products, and its ability to transition the server market to a 64-bit ISA.

      Across the spectrum of things the Review could have covered, overclocking potential is pretty irrelevant.

      <3 Damage.

    • Anonymous
    • 16 years ago

    I wonder how the AMD chips would have faired in Quake III with the AMD driver that guy made who claimed the 3dnow crap was broke in it and let it use SSE instead. Notice how AMD came out well in ET though? Perhaps they fixed the problem in there…

      • Anonymous
      • 16 years ago

      The fixed .dll was found to not do anything to SSE or 3dnow at all. It gave the same increase in performance in both AMD and Intel computers

    • Anonymous
    • 16 years ago

    Where is AMD’s hammer chipset? The 8000 or something…

    • Anonymous
    • 16 years ago

    “intended to establishing”?

    Shouldn’t it be “intended for establishing” or “intended to establish” – ?

    • lovswr
    • 16 years ago

    Kudos, to Damage & AMD. Now with the apparent resurgence of competition in the GPU market, I may be able to afford a complete system upgrade this year after all 🙂

    • LicketySplit
    • 16 years ago

    Darn…smileys wont work…wonder if the finger would!

    • LicketySplit
    • 16 years ago

    Just took a peek at intel shill Tom’s….he paints an all intel picture…heh? Stone him i say:)

      • droopy1592
      • 16 years ago

      Tomshardware is throwing in 3.6Ghz overclocks for some reason.

      What’s the purpose?

      I think he should be stoned for overclocking the P4EE and not the AthlonFx 51. I’ll guess he’ll do anything to get the benches in Intel’s favor. He kills me…

        • indeego
        • 16 years ago

        My theory is after the Intel vs Amd fanless burning article Tom gets his pick of the litter on processors. Sure, Tom, have this 3.4 capable *cough* I mean stock 3.2, *wink wink*.

        Pick o the litter.

        I wouldn’t put ANYTHING past Mr. Pabst.

        • Autonomous Gerbil
        • 16 years ago

        He took those out, citing the fact that they were supposed to be for a later article. He must have taken too much heat for rigging the test so that Intel came out clearly on top.

          • Pettytheft
          • 16 years ago

          I agree he shouldnt’ have provided the overclocked Intel scores. But if you read each caption it never mentions the overclocked processors. It always compares it to the 3.2Ghz chip. His conclusion seems to be on par with the rest of the sites. I dunno, maybe it’s just that I remember when Toms was in AMD’s pocket.

            • wagsbags
            • 16 years ago

            there’s an update on the page that says it was a mistake.

      • Anonymous
      • 16 years ago

      For those who think Tom is biased for Intel, check this little historical reminder:

      http://www20.tomshardware.com/cpu/200008282/pentiumiii-02.html

        • Yahoolian
        • 16 years ago

        Times change, biases change. Live in the present.

    • IntelMole
    • 16 years ago

    By the way Damage, could your girlfriend get into tighter leather?

    (If so, can I have the pictures?)

    :-D,
    -Mole

      • Damage
      • 16 years ago

      Wrong article, wrong author. 🙂

        • eitje
        • 16 years ago

        *shrug* i doubt any of us would discriminate about seeing a TR lady in even tighter leather. 😉

        hmmm… new TR merchandise….

        • IntelMole
        • 16 years ago

        <dumb voice>Duh, now I feel stupid</dv>

        😀

        Damn, that was a good little post as well, now the joke is ruined 😛

        Maybe if I post it quickly… no :-D,
        -Mole

    • LicketySplit
    • 16 years ago

    Impressive is the word…great job damage…looks like VIA finally got it rite eh? Just needed to see a big billy club facsimilie in the background…:)

    • daniel4
    • 16 years ago

    I’m also wondering if the comment supposedly made by an Intel representative is true and 3.2Ghz P4EE will offer higher performance than a 3.2Ghz Prescott as far as gaming is concerned. If the P4EE does indeed jump to 3.4Ghz by year end and Prescott will debut at 3.4Ghz at year end and jump to 3.6Ghz in late Q1’04 it looks like a Athlon64 FX51/53 should be enough to stave off even a 3.6Ghz Prescott. That certainly bodes well for AMD!

      • Hockster
      • 16 years ago

      I also thought about that, as my initial thoughts were that if the P4EE is this fast then the Prescott will destroy the Athlon 64 in everything. But of course I doubt the Prescott will be that much faster than the P4EE…or will it? 🙂

        • Krogoth
        • 16 years ago

        It’s rumored that the reason for the Prescott’s delay is that Intel may incorporate the x86-64 architecture into the Prescott for marketing’s sake. Even so, I think the Prescott’s biggest obstacle is thermal: the initial steppings will generate at least 100W of heat, which is close to the peak heat production of the latest Northwoods with HT enabled. Until the BTX form factor becomes mainstream, the Prescotts will not be able to scale very well at all.

          • droopy1592
          • 16 years ago

          Prescott has dual integer cores, and with SSE2 it’s 64-bit capable, but without the addressing. It’s not quite x86-64. It’s something different, and Microsoft has talked of supporting 4-5 64-bit platforms.

            • Anonymous
            • 16 years ago

            Not necessarily. There’s a fair amount of evidence around that it has the same x86-64 capabilities as the Hammer, along with the addressing. Intel hasn’t provided a whole lot of information regarding the Prescott, it’s much too early to say what and what not it actually is.

            Microsoft’s statement is far too vague to read too much into it. For example, by platforms do they mean different architectures or slightly different designs by different manufacturers?

            All we can really do is wait until Prescott is released or some more solid information on it is leaked/released from Intel.

            • droopy1592
            • 16 years ago

            The “evidence” shows dual integer cores, which isn’t exactly the same as x86-64.

    • danny e.
    • 16 years ago

    am i the only one pissed that Intel is getting away with stealing a little of AMD’s thunder with a PAPER-LAUNCH. the “extreme-edition” P4 is a XEON … that is not available and probably never will be.

    i am saddened that TR chose to give them any press at all.

    – danny e.

      • IntelMole
      • 16 years ago

      Yes.

      😛

      You’re the only one that cares at least. More processors from either company = competition, at least now that AMD have a processor that can compete.

      Me, I’m waiting a month or so for the prices to drop on these babies, and for AMD to clear out its stock of “just about make a little higher than 3200+” processors, then I’ll grab two and put ’em up to (3400+)+ speeds … if that makes sense 😀

      Fantastic,
      -Mole

        • danny e.
        • 16 years ago

        i wouldnt be upset if the “extreme edition” P4 were a real product…. but Intel is just pulling an nVidia here.

        remember when the Radeon 9700 came out…. nVidia was like… “dont buy that… our product is just around the corner and is better and faster” .. and 90% of everyone fell for it??

        history repeats itself with different players ….

        – danny e.

          • IntelMole
          • 16 years ago

          I thought it was “those who forget history are doomed to repeat it”

          Anyways…

          DaveJB, I’m not bashing Intel 😀

          Anyways, they’re only paper-launching it by two months, not the God-knows-how-long it was for Nvidia… 6 months?
          -Mole

      • Krogoth
      • 16 years ago

      Well, that’s not entirely accurate. A P4 EE is really just a Xeon that failed Intel’s quality standards in an SMP environment. Instead of just recycling the Xeon, the clever marketing people at Intel decided to use it to steal some thunder from the A64 FX, while at the same time preventing the erosion of the P4’s brand name in the marketplace. So the P4 EE will probably end up being available only in very limited quantity and, of course, expensive. My advice for Intel folk is just to hold out and wait for the Prescotts to get in the game.

      • leor
      • 16 years ago

      intel will most likely be losing money on the p4 EE, as it goes for over 2,000 dollars as a xeon. with the amount of wafer space taken up by that huge die, it’s clearly just a desperate attempt to stay competitive since prescott is still a ways off. i’ll be very surprised if it retails for under 800 dollars.

      i expect intel to lag behind quite a bit until their .09 process matures and amd starts having their .09 problems (unless IBM helps them through it)

        • Anonymous
        • 16 years ago

        I somehow doubt Intel is losing money on every P4EE sold. The price of Xeons is set that high because of the market Intel sells to. Remember, Intel’s margins are extremely high for the Xeon which is exactly why AMD wants to get into that market with Opteron. But regardless, I speculate (complete speculation with no quantitative evidence) Intel’s margins are not even close to 50% which is Intel’s average percentage profit per processor.

          • WaltC
          • 16 years ago

          I agree Intel isn’t losing money with the P4EE, because Intel isn’t selling any…:) P4EE is a new class of cpu called “review ware”….:)

            • DaveJB
            • 16 years ago

            I find it amusing that AMD fanboys are bashing Intel for “paper launching” the P4EE, considering that AMD did the same with the 2400+, 2600+, and 2800+.

            • WaltC
            • 16 years ago

            Heh…:) One doesn’t have to be a “fanboy” to understand the AFX’s are in the channel now and available, and the P4EE is not. Most of Intel’s P4 launches of the last few years have been “paper,” too. What’s remarkable about the AFX launch, and different from most of the Athlon/P4 launches of the last couple of *years* is the fact that the AFX was in the channel and available for purchase on the day of the launch. We haven’t seen that for a long time…from anybody, including Intel (Years ago, Intel never launched without product in the channel–but that’s not been true for quite sometime.)

            What’s intriguing to me here is that something’s definitely up with Prescott–just don’t know what–otherwise sending out overclocked Xeons with 2mb L3’s to selected web sites well ahead of even OEM availability, along with NDA’s timed to expire on the day of the AFX launch, would not be happening.

            As some other people have termed it, “EE” stands for “emergency edition”…:)

            • Anonymous
            • 16 years ago

            Another way to look at it is like this:

            If these chips “are” failed SMP Xeons, doesn’t it make sense to make SOMETHING off of it as opposed to totally eating its production costs?

            Of course I don’t know what kind of recycling facility Intel has to deal with failed chips, but this way the chip is already made so why not sell it for a lower price (even if it’s very close to cost).

            Then again I may be barking up the wrong tree.. 😉

            Peace,
            Kevin

            legionosh at msn.com

            • DaveJB
            • 16 years ago

            I assume that if Intel can convert failed P4s into Celerons, it should be able to do the same with failed Xeons.

            Mind you, I reckon that P4EEs are really speed-binned Xeon MPs, considering that the Gallatin core is only designed for a 533MHz FSB and sub-3GHz speeds.

            • Anonymous
            • 16 years ago

            No office applications test for the AMD64’s?
            Not relevant anymore?

            • Pete
            • 16 years ago

            Who’s going to buy a $1000+ CPU+RAM combo for office apps? How fast do you need to render that paper clip, anyway? 😉

    • daniel4
    • 16 years ago

    What’s cool is that the Athlon64 FX forced Intel to paper launch a product, which is something that we haven’t seen for a long time now from them. Someone mentioned November availability at the earliest? That’s pretty far off, but it’s nice to see a comparison nonetheless. Also yeah someone mentioned something about the FX 53 coming out sooner than later because of the P4 EE.

    • leor
    • 16 years ago

    the FX manhandles the p4 3.2 ghz, only the EE can compete, and i wouldn’t be surprised if the FX got a speed bump (3.4?) by november

    • eitje
    • 16 years ago

    Excellent review, guys.

    I can’t say that my socks have been knocked off by the 64-FX-51-hut-hut’s performance, but that’s mostly because I was lucky enough to be wearing shoes at the time of reading this article. Certainly, I agree that it was VERY wise for Intel to release their P4EE, because otherwise this announcement would have probably been amazing. my shoes (and socks) would be across the room, if not out the window.

    It looks like Hector listened to the open letter i posted to him in comments a few weeks ago; the 2/3rds price difference between the A64 3200+ and the P4 3.2 (non-EE) looks very good to me. 🙂

    given that you were working with either a 32-bit OS, or a 64-bit pre-beta OS with non-compliant drivers, I’m relieved to say that at least AMD is still in the game – today was make or break day for them, and (in my opinion) they managed to do neither. 😉

    Thanks again, D, for getting this done!

    ps – TR server seems to be handling the release-day load well.

    • Rousterfar
    • 16 years ago

    woot!!!

    Finally here.

    • Nelliesboo
    • 16 years ago

    Great review, but as good as the FX is, it just wins. It doesn’t slap the P4 around too much, it just wins, which isn’t bad, but the new P4 should be a lot better than the current one. i guess it’s a wait and see. also, they are pricing them the same? yea right. what about video encoding??? when they fix that maybe…

    • Anonymous
    • 16 years ago

    I thought that was really great when the Athlon Xp 3200+ beat out everyone in ScienceMark 2.0 Beta. Still shows some life left in the Xp line.

    • Anonymous
    • 16 years ago

    Great review 🙂 The ECC/Registered PC3200 DDR seems to be $100 cheaper this week than last, odd… $180 for 512megs at newegg….

    http://www.newegg.com/app/viewproduct.asp?description=20-146-211

      • daniel4
      • 16 years ago

      Corsair, Kingston, Samsung, and OCZ are ready to roll out retail DDR400 registered memory chips and thus that’s probably why the Mushkin stick dropped so dramatically. One store started selling the Corsair sticks about a week ago, listing them for the same $180 price and I am guessing they will drop as demand increases.

    • Anonymous
    • 16 years ago

    Damage!
    Thanks very much for including Lightwave in this review.

    • Hockster
    • 16 years ago

    Nice review, and very nice processors too. Even though the spotlight is meant to be on AMD’s new chip here, I can’t help but mention how much of an improvement the Pentium 4 3.2GHz Extreme Edition processor has made over the older one – that Quake III score is insane!

    • Anonymous
    • 16 years ago

    Damage, why did you use the GeForce FX 5900 Ultra for the tests, instead of the Radeon 9800 Pro?

    Just wondering.

      • Damage
      • 16 years ago

      Had 64-bit drivers for the GeForce.

    • Captain Ned
    • 16 years ago

    Now that I’ve read it, all I’ve got to say is that Intel really needs the Prescott to be a world-beater. As the A64 platform matures and 64-bit apps come on line, Intel’s going to need help.

    Now, about that Vermont crack. I dare say that my fair state is well in excess of 237 mm^2. Why, we’re up to 300 mm^2 at least. 😉

      • IntelMole
      • 16 years ago

      Imagine the heat generated by a Prescott Extreme Edition!

      Now THAT’S scary! 😀

      All in all, it looks like I’ll be running an FX-51 duallie then – if only…

      But the clockspeed increase on these beasts should be impressive considering the heat output at the moment…

      Imagine a decent 90nm implementation of the hammer core… can you say “overclock?!?!?!”

      btw Damage, what voltage were these chips at? (P4EE and A64s)…
      -Mole

    • Anonymous
    • 16 years ago

    Very thorough article, nice job. What kind of wattage does the 64 3200 generate?

      • Pete
      • 16 years ago

      Great review! I’m also curious about the wattage comparison.

        • Anonymous
        • 16 years ago

        I read both 64Bit Athlons are 89w. The P4ee is 93w. P4 3.2 is 82w.

          • axeman
          • 16 years ago

          aceshardware, which I consider to be quite reputable said that AMD’s specs call for all Athlon 64 thermal solutions to be capable of dissipating up to 89w of heat, but that the exact specs for the CPUs just released are unknown as yet. At least AMD got their act together by getting a heatspreader/on-die thermal protection/infinitely better HSF retention mechanism.

            • droopy1592
            • 16 years ago

            It’s more like 70 watts. I don’t know about the FX series though.

            #122 from the forbes article:

            y[

          • Pete
          • 16 years ago

          Well, I believe it was Anand who made it quite clear that though AMD specs current A64 HSF to dissipate 89W, he doubts the CPUs will get close to needing such effectiveness at the current process. So I expect it to be considerably less than 89W. 70W sounds reasonable consider the 1MB L2, I suppose.

    • Anonymous
    • 16 years ago

    course Intel’s EE is meant to spoil AMD’s launch.. what is surprising is how well the Athlon 64 does against it, considering it’s a new product.. and has “new” chipset drivers and probably “new” Motherboard drivers.. And the P4 is the more mature platform.. Also the EE i thought wouldn’t be out till November?

    • leor
    • 16 years ago

    guess it was noon then . . .

    • Captain Ned
    • 16 years ago

    FP. Now I’ll read the article.
